Foreword



The Workshop on Approximation Algorithms for Combinatorial Optimization
Problems APPROX’2000 focuses on algorithmic and complexity aspects arising
in the development of efficient approximate solutions to computationally
difficult problems. It aims, in particular, at fostering cooperation among
algorithmic and complexity researchers in the field. The workshop, to be held
at the Max-Planck-Institute for Computer Science in Saarbrücken, Germany,
co-locates with ESA’2000 and WAE’2000. We would like to thank the local
organizers at the Max-Planck-Institute (AG 1, Kurt Mehlhorn) for this
opportunity. APPROX is an annual meeting, with previous workshops in Aalborg and
Berkeley. Previous proceedings appeared as LNCS 1444 and 1671.

Topics of interest for APPROX’2000 are: design and analysis of approximation
algorithms, inapproximability results, on-line problems, randomization
techniques, average-case analysis, approximation classes, scheduling problems,
routing and flow problems, coloring and partitioning, cuts and connectivity,
packing and covering, geometric problems, network design, and various applications.
The number of papers submitted to APPROX’2000 was 68, from which 23 papers
were selected. This volume contains the selected papers plus papers by invited
speakers. All papers published in the workshop proceedings were selected by the
program committee on the basis of referee reports. Each paper was reviewed
by at least three referees who judged the papers for originality, quality, and
consistency with the topics of the conference.

We would like to thank all authors who responded to the call for papers and 
our invited speakers: Sanjeev Arora (Princeton), Dorit S. Hochbaum (Berkeley), 
Rolf H. Möhring (Berlin), and David B. Shmoys (Cornell). Furthermore, we
thank the members of the program committee: 

— Klaus Jansen (University of Kiel), 

— Tao Jiang (University of California, Riverside), 

— Sanjeev Khanna (University of Pennsylvania), 

— Samir Khuller (University of Maryland),

— Jon Kleinberg (Cornell University), 

— Stefano Leonardi (Università di Roma),

— Rajeev Motwani (Stanford University), 

— Baruch Schieber (IBM Research), 

— Martin Skutella (Technical University Berlin), 

— Éva Tardos (Cornell University / UC Berkeley),

— Gerhard Woeginger (Technical University Graz), and 

— Neal Young (Dartmouth College) 



and the reviewers F. Afrati (Athens), S. Albers (Dortmund), E. M. Arkin (Stony 
Brook), E. Bampis (Evry), L. Becchetti (Rome La Sapienza), E. Cela (Graz), C. 
Chekuri (Bell Labs), J. Cheriyan (Ontario), A. Clementi (Rome Tor Vergata),
B. DasGupta (Camden), T. Erlebach (Zurich), W. Fernandez de la Vega (Or- 
say), A. V. Fishkin (Kiel), P.G. Franciosa (Rome La Sapienza), G. Galambos 
(Szeged), S. Guha (Stanford University), R. Hassin (Tel-Aviv), C. Kenyon (Or- 
say), B. Klinz (Graz), J. van de Klundert (Maastricht), A. Marchetti-Spaccamela 
(Rome La Sapienza), J. Mitchell (Stony Brook), Y. Mills (Athens), D. Mount 
(Maryland), K. Munagala (Stanford University), S. Naor (Bell Labs and Tech- 
nion), J. Noga (Riverside), L. Porkolab (London), B. Raghavachari (Dallas), O. 
Regev (Tel-Aviv), T. Roughgarden (Cornell), E. Seidel (Kiel), A. Schulz (MIT, 
Sloan), R. Solis-Oba (London), F. Spieksma (Maastricht), A. Srivastav (Kiel), 
M. Sviridenko (Aarhus), M. Uetz (TU Berlin), A. Vetta (MIT, Cambridge), and 
A. Zhu (Stanford University). 

We gratefully acknowledge sponsorship from the Max-Planck-Institute for
Computer Science Saarbrücken (AG 1, Kurt Mehlhorn), the EU working group
APPOL Approximation and On-line Algorithms, the DFG Graduiertenkolleg
Effiziente Algorithmen und Mehrskalenmethoden and the Technical Faculty and
Institute of Computer Science and Applied Mathematics of the
Christian-Albrechts-Universität zu Kiel. We also thank Aleksei V. Fishkin, Eike Seidel, and Brigitte
Preuss of the research group Theory of Parallelism, and Alfred Hofmann and
Anna Kramer of Springer-Verlag for supporting our project.

July 2000 Klaus Jansen, Workshop Chair 

Samir Khuller, APPROX’2000 Program Chair 
Approximation Algorithms That Take Advice



Sanjeev Arora 



Department of Computer Science 
Princeton University 
Princeton, NJ 08544-2087 
arora@cs.princeton.edu



Abstract. Many recently designed approximation algorithms use a simple
but apparently powerful idea. The algorithm is allowed to ask a
trusted oracle for a small number (say O(log n)) of bits of “advice.” For
instance, it could ask for O(log n) bits of the optimum answer.

Of course, strictly speaking, a polynomial-time algorithm has no need for 
log n bits of advice: it could just try all possibilities for this advice and 
retain the one that works the best. Nevertheless, this is a useful way of 
thinking about some approximation algorithms. In the talk I will present 
a few examples. 
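
The observation that a polynomial-time algorithm can simply try every possible advice string can be sketched as follows. This is an illustrative example only and is not taken from the talk: the knapsack setting, the function names `greedy_with_advice` and `solve_by_trying_all_advice`, and the choice of advice (the index of one item assumed to belong to a good solution, which takes about log n bits) are all assumptions made for the sketch.

```python
def greedy_with_advice(items, capacity, advice):
    """items: list of (value, weight) with positive weights.
    advice: index of one item that the oracle claims belongs to a good solution."""
    value0, weight0 = items[advice]
    if weight0 > capacity:
        return None  # this advice cannot be correct
    chosen, remaining, total = [advice], capacity - weight0, value0
    # Fill the rest greedily by value density.
    rest = sorted((i for i in range(len(items)) if i != advice),
                  key=lambda i: items[i][0] / items[i][1], reverse=True)
    for i in rest:
        value, weight = items[i]
        if weight <= remaining:
            chosen.append(i)
            remaining -= weight
            total += value
    return total, chosen


def solve_by_trying_all_advice(items, capacity):
    """No oracle needed: n advice values means only n runs of the greedy step."""
    best = (0, [])
    for advice in range(len(items)):
        result = greedy_with_advice(items, capacity, advice)
        if result is not None and result[0] > best[0]:
            best = result
    return best


items = [(60, 10), (100, 20), (120, 30)]
best_value, best_items = solve_by_trying_all_advice(items, 50)
```

The point of the sketch is only the outer loop: with O(log n) advice bits there are polynomially many candidate strings, so enumerating them keeps the running time polynomial. The greedy step itself carries no guarantee here.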

My title is a play on the title of a classic paper on nonuniform compu- 
tation “Turing Machines that take advice” (Karp and Lipton 1982). 
Instant Recognition of Polynomial Time
Solvability, Half Integrality, and
2-Approximations



Dorit S. Hochbaum * 



Department of Industrial Engineering and Operations Research 
and Walter A. Haas School of Business 
University of California, Berkeley 
hochbaum@ieor.berkeley.edu



1 Introduction 

We describe here a technique applicable to integer programming problems which
we refer to as IP2. An IP2 problem has linear constraints where each constraint
has up to three variables with nonzero coefficients, and one of the three
variables appears in that constraint only. The technique is used either to
identify polynomial time solvability of the problem, in the time required to solve a
minimum cut problem on an associated graph; or, in case the problem is NP-hard,
to generate a superoptimal solution all components of which
are integer multiples of 1/2. In some of the latter cases, for minimization problems,
the half integral solution may be rounded to a feasible solution that is provably
within a factor of 2 of the optimum.

An associated technique for approximating maximization problems with three 
variables per inequality relies on casting the problem as a generalized satisfiabil- 
ity problem, or MAX GEN2SAT, where the objective is to maximize the weight 
of satisfied clauses representing any pairwise boolean relationship. This tech- 
nique employs semidefinite programming and generates approximate solutions 
that are close to the optimum, within 13% of optimum or 21%, depending on the 
type of constraints. For both minimization and maximization, the recognition 
of the approximability, or polynomial time solvability, of the problem, follows 
immediately from the integer programming formulation of the problem. 

The unified technique we outline provides a method of devising 2-approxi- 
mation algorithms for a large class of minimization problems. Among the NP- 
hard problems that were shown to be 2-approximable using the technique: mini- 
mum satisfiability - minimizing the weight of satisfied clauses in a CNF formula; 
a scheduling problem with precedence constraints [CH97]; minimum weight node 
deletion to obtain a complete bipartite subgraph and various node and edge dele- 
tion problems, [Hoc98]; a class of generalized satisfiability problems, [HP99]; and 
the feasible cut problem. Among maximization problems, a notable example for 

* Research supported in part by NSF award No. DMI-9713482, NSF award No. DMI- 
9908705. 
which the technique provides a good approximation is the forest harvesting
problem.

2 The Technique of Transformation to Polynomially 
Solvable Formulations 

The technique involves the formulation of the problem as an integer program- 
ming problem, then applying a transformation that results in integer optimiza- 
tion over totally unimodular constraints. We proved that an integer program on 
monotone constraints (defined below in Section 3) is polynomial time solvable 
for variables bounded in polynomial length intervals. This fact is utilized to rec- 
ognize easily whether the given problem is solvable in polynomial time. If the 
problem is not monotone and NP-hard then superoptimal half integral solutions 
are of interest. For approximation algorithms, the approach we take for
minimization problems differs from that for maximization problems. We first
discuss approximations for minimization problems.

The key to our technique is to transform non-monotone formulations into 
monotone ones. In this there is a loss of factor of 2 in integrality. The mono- 
tone formulation is then transformed into an equivalent formulation on a totally 
unimodular matrix of constraints. This transformation is valid for either mini- 
mization or maximization problems. 

The inverse transformation, however, does not map integers to integers, but
rather to integer multiples of a half. We therefore refer to such transformations as
factor 2 transformations. The resulting solution satisfies all the constraints and
is superoptimal, with components that are integer multiples of 1/2. Superoptimality
means that the value of such a solution is a lower bound for minimization problems
and an upper bound for maximization problems. In many cases of minimization
problems, if it is possible to round the fractional solution to a feasible solution
then this solution is 2-approximate (see Theorem 1, part 3).

3 The Basic Formulation of IP2 Problems 

The formulation of the class of problems amenable to the technique of factor 2
transformations allows up to two variables per inequality and a third variable
that can appear only once in the set of inequalities. Let $|d_i| \in \{0, 1\}$.

\[
\text{(IP2)} \quad
\begin{array}{lll}
\min & \sum_{j=1}^{n} w_j x_j + \sum_{i=1}^{m} e_i z_i & \\
\text{subject to} & a_i x_{j_i} + b_i x_{k_i} \ge c_i + d_i z_i & \text{for } i = 1, \ldots, m \\
 & \ell_j \le x_j \le u_j & j = 1, \ldots, n \\
 & z_i \text{ integer} & i = 1, \ldots, m \\
 & x_j \text{ integer} & j = 1, \ldots, n.
\end{array}
\]

A constraint of IP2 is monotone if $a_i$ and $b_i$ appear with opposite signs and
$d_i = 1$. Monotone IP2 problems are solvable in polynomial time, for polynomially
bounded $u_j - \ell_j$, as we demonstrated in [Hoc97].
We note that any linear programming problem, or integer programming problem,
can be written with at most three variables per inequality. Therefore the
restriction that one of the variables appears in only one constraint does limit the
class of integer programming problems considered. The technique is applicable
also when each of the three variables appears more than once in each constraint.
The quality of the bounds and approximation factors deteriorates, however, in
this more general setup. Nevertheless, the class of IP2 problems is surprisingly
rich. The results also apply to IP2 problems with convex objective functions.



4 The Main Theorem 

Our main theorem has three parts addressing polynomial time solvability,
superoptimal half integral solutions and 2-approximations respectively. The run time
in all cases depends on the parameter $U$, which is the length of the range of the
variables, $U = \max_{j=1,\ldots,n}(u_j - \ell_j)$. An important special case of IP2 where the
complexity does not depend on the value of $U$ is binarized IP2:

Definition 1. An IP2 problem is said to be binarized if all coefficients in the
constraint matrix are in $\{-1, 0, 1\}$, that is, if $\max_i \{|a_i|, |b_i|\} = 1$.

Note that a binarized system is not necessarily defined on binary variables. 

For IP2 problems, the running time required for finding a superoptimal half
integral solution is expressed in terms of the time required to solve a linear
program over a totally unimodular constraint matrix, or in terms of minimum
cut complexity. In the complexity expressions we take $T(n,m)$ to be the time
required to solve a minimum cut problem on a graph with $m$ arcs and $n$ nodes.
$T(n,m)$ may be assumed equal to $O(mn \log(n^2/m))$, [GT88]. For binarized
systems the running time depends on the complexity of solving a minimum cost
network flow problem, $T_1(n,m)$. We set $T_1(n,m) = O(m \log n \,(m + n \log n))$,
the complexity of Orlin’s algorithm, [Orl93].

Theorem 1. Given an instance of IP2 on $m$ constraints, $x \in Z^n$, with
$U = \max_{j=1,\ldots,n}(u_j - \ell_j)$.

1. A monotone IP2 is solvable optimally in integers in time $T(nU, mU)$, and a
binarized IP2 is solved in time $T_1(n,m)$.

2. A superoptimal half integral solution is obtained for IP2 in polynomial
time: $T(nU, mU)$. For binarized IP2, a half integral superoptimal solution is
obtained in time $T_1(2n, 2m)$.

3. Given an IP2 with an objective function $\min\, wx + ez$ such that $w, e \ge 0$.

— For $\max |d_i| = 0$, if there exists a feasible solution then there exists a
feasible rounding of the half integral solution to a 2-approximate
solution, [HMNT93].

— For IP2, if there exists a feasible rounding of the half integral solution,
then it is a 2-approximate solution.
Note the difference between the cases of three variables per inequality and
two variables per inequality in part 3: the statement is conditional upon
the existence of a feasible rounding in the three variables case, whereas a feasible
rounding is guaranteed to exist (for feasible problems) in the case of two variables. In all the
applications discussed here the coefficients of $A_1, A_2$ are in $\{-1, 0, 1\}$, that is,
the problems are binarized and the variables are all binary. The running time of
the 2-approximation algorithm for such binary problems is equivalent to that of
finding a maximum flow or minimum cut of a graph with $n$ nodes and $m$ arcs,
$T(n, m)$.



5 The Usefulness of Superoptimal Solutions for 
Enumerative Algorithms 

A superoptimal solution provides a lower bound for minimization problems and 
an upper bound for maximization problems. The classic integer programming
tool, Branch-and-Bound, requires good bounds which are feasible solutions as
well as good estimates of the value of the optimum (lower bounds for minimization
problems and upper bounds for maximization). Approximation algorithms in
general address both the issue of guarantee and of making good feasible solutions 
available. The analysis of approximation algorithms always involves deriving es- 
timates on the value of the optimum. As such, approximation algorithms and 
their analysis are useful in traditional integer programming techniques. 

The proposed technique has the added benefit of the exceptional bound quality
provided along with the half integral solutions. One good method of obtaining
bounds is to use linear programming relaxation. Another, improved, relaxation
is derived by solving over the feasible set of half integers. The solutions we
obtain are half integral but are selected from a subset of all half integral solutions and
thus provide better bounds than either the linear programming or the half integral
relaxation. Note that solving the half integral relaxation is NP-hard. Thus we
generate, in polynomial time, a bound that is provably better than the bound
generated by linear programming relaxation and a bound that is NP-hard to
find. A discussion of the relative tightness of these bounds, as well as a proof of
the NP-hardness of the half integral relaxation, is given in Section 2 of [HMNT93].

Another important feature of the superoptimal solution is persistency. All 
problems that can be formulated with up to two variables per inequality have 
the property that the superoptimal solution derived (with the transformation 
technique) has all integral components maintaining their value in an optimal 
solution. That means that those variables can be fixed in an enumerative algo- 
rithm. Such fixing of variables limits the search space and improves the efficiency 
of any enumerative algorithm. 

A further attractive feature is that solving for the superoptimal solutions is 
done by flow techniques which are computationally more efficient than linear 
programming. The bounds obtained by the above technique are both efficient 
and tight and thus particularly suitable for use in enumerative algorithms. 
6 Inapproximability

Not only is the family of minimization problems amenable to 2-factor transfor- 
mations rich, but casting an NP-hard problem within this framework provides an 
easy proof that the problem is at least as difficult to approximate as vertex cover, 
and is thus MAX SNP-hard. There is no $\delta$-approximation algorithm for vertex
cover for $\delta < 16/15$, and hence not for the problems in this framework, unless
NP=P, [BGS95]. We conjectured in [Hoc83] that no better than 2-approximation
is possible unless NP=P. Modulo that conjecture, the 2-approximations obtained 
for all problems in this class are best possible. 



7 Applications for Minimization Problems 

7.1 The Generalized Vertex Cover Problem 

Given a graph $G = (V, E)$ with weights $w_i$ associated with the vertices, let
$n = |V|$, $m = |E|$. The vertex cover problem is to find a subset of vertices $S \subseteq V$
so that every edge in $E$ has an endpoint in $S$ and so that among all such covers $S$
minimizes the total sum of vertex weights. Unlike the vertex cover problem, the
generalized vertex cover problem permits leaving some edges uncovered,
but there is a charge, $c_{ij}$, for each uncovered edge $(i,j)$:

\[
\text{(Gen-VC)} \quad
\begin{array}{lll}
\mathrm{OPT} = \min & \sum_{j \in V} w_j x_j + \sum_{(i,j) \in E} c_{ij} z_{ij} & \\
\text{subject to} & x_i + x_j \ge 1 - z_{ij} & (i,j) \in E \\
 & x_i, z_{ij} \text{ binary} & \text{for all } i, j.
\end{array}
\]

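The first step in solving the problem is to monotonize the constraints and generate
a relaxation of the problem. Each variable $x_j$ is replaced by two variables
$x_j^+$ and $x_j^-$. Each variable $z_{ij}$ is replaced by two variables, $z'_{ij}$ and $z''_{ij}$:

\[
\begin{array}{lll}
Z_{1/2} = \min & \frac{1}{2} \Big[ \sum_{j \in V} w_j x_j^+ - \sum_{j \in V} w_j x_j^- + \sum_{(i,j) \in E} c_{ij} z'_{ij} + \sum_{(i,j) \in E} c_{ij} z''_{ij} \Big] & \\
\text{subject to} & x_i^+ - x_j^- \ge 1 - z'_{ij} & (i,j) \in E \\
 & -x_i^- + x_j^+ \ge 1 - z''_{ij} & (i,j) \in E \\
 & x_i^+, z'_{ij}, z''_{ij} \text{ binary}, \; x_i^- \in \{-1, 0\} & \text{for all } i, j.
\end{array}
\]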

This monotonized integer program is solvable via a minimum cut algorithm on
a graph with $2|V|$ nodes and $2|E|$ arcs constructed as described in [Hoc97]. Thus
we derive an optimal integer solution to this problem in time $O(mn \log(n^2/m))$.
The value of the optimal solution to the monotonized problem, $Z_{1/2}$, is at most
the optimal value of Gen-VC, OPT. To see this we construct the solution

\[
x_i = \frac{x_i^+ - x_i^-}{2}, \qquad z_{ij} = \frac{z'_{ij} + z''_{ij}}{2}.
\]

This solution is half integral, with each component in $\{0, \frac{1}{2}, 1\}$, and feasible
for Gen-VC: adding up the two constraints yields

\[
\frac{x_i^+ - x_i^-}{2} + \frac{x_j^+ - x_j^-}{2} \ge 1 - \frac{z'_{ij} + z''_{ij}}{2}.
\]

Thus $Z_{1/2} \le \mathrm{OPT}$. Here we can round the $x$-variables up and the variables $z_{ij}$
either down or up. The rounding results in a solution with objective value at
most $2 \cdot Z_{1/2}$ and thus at most twice the value of the optimum, $2 \cdot \mathrm{OPT}$. This is
therefore a 2-approximate solution.
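
The monotonize, recover, and round steps for Gen-VC can be checked on a tiny instance. The sketch below is only an illustration under stated assumptions: it replaces the minimum cut computation of [Hoc97] by brute-force enumeration (so it is usable only for a handful of vertices), it assumes nonnegative weights and charges, and the function names and the triangle instance are hypothetical.

```python
from itertools import product


def gen_vc_value(x, nodes, edges, w, c):
    """Gen-VC objective for binary x; z_ij = 1 exactly on uncovered edges."""
    return sum(w[v] * x[v] for v in nodes) + sum(
        c[i, j] for (i, j) in edges if x[i] + x[j] == 0)


def gen_vc_optimum(nodes, edges, w, c):
    """Exact optimum by enumerating all binary x (tiny instances only)."""
    return min(gen_vc_value(dict(zip(nodes, bits)), nodes, edges, w, c)
               for bits in product([0, 1], repeat=len(nodes)))


def monotonized_optimum(nodes, edges, w, c):
    """Brute-force stand-in for the minimum cut step.
    Returns (Z_half, x_half) with x_half[v] = (x_plus[v] - x_minus[v]) / 2."""
    best = None
    n = len(nodes)
    for plus in product([0, 1], repeat=n):
        for minus in product([-1, 0], repeat=n):
            xp, xm = dict(zip(nodes, plus)), dict(zip(nodes, minus))
            # With c >= 0 the cheapest z', z'' are 1 only where a constraint would fail.
            z1 = {(i, j): max(0, 1 - (xp[i] - xm[j])) for (i, j) in edges}
            z2 = {(i, j): max(0, 1 - (xp[j] - xm[i])) for (i, j) in edges}
            value = 0.5 * (sum(w[v] * (xp[v] - xm[v]) for v in nodes)
                           + sum(c[e] * (z1[e] + z2[e]) for e in edges))
            if best is None or value < best[0]:
                best = (value, {v: (xp[v] - xm[v]) / 2 for v in nodes})
    return best


# Hypothetical instance: a triangle with unit vertex weights and charge 5 per uncovered edge.
nodes = [1, 2, 3]
edges = [(1, 2), (2, 3), (1, 3)]
w = {v: 1 for v in nodes}
c = {e: 5 for e in edges}

opt = gen_vc_optimum(nodes, edges, w, c)
z_half, x_half = monotonized_optimum(nodes, edges, w, c)
x_rounded = {v: 1 if x_half[v] > 0 else 0 for v in nodes}  # round x up
rounded_value = gen_vc_value(x_rounded, nodes, edges, w, c)
```

On the triangle the relaxation puts every vertex at 1/2, so the bound is strictly below the integer optimum, and rounding up takes all three vertices.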



7.2 The Clique Problem 

Consider an NP-hard problem equivalent to the maximum clique problem. The 
aim is to remove a minimum weight (or number) collection of edges from a graph 
so the remaining connected subgraph is a clique. This problem is equivalent to 
maximizing the number of edges in a subgraph that forms a clique, which is 
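equivalent in turn to the maximum clique problem. Let the graph be $G = (V, E)$,
and let the variable $x_i$ be 1 if node $i$ is not in the clique, and 0 if it is in the
clique. $z_{ij} = 1$ if edge $(i,j)$ is deleted.

\[
\text{(Clique)} \quad
\begin{array}{lll}
\min & \sum_{(i,j) \in E} c_{ij} z_{ij} & \\
\text{subject to} & z_{ij} - x_i \ge 0 & (i,j) \in E \\
 & z_{ij} - x_j \ge 0 & (i,j) \in E \\
 & x_i + x_j \ge 1 & (i,j) \notin E \\
 & x_i, z_{ij} \text{ binary} & \text{for all } i, j.
\end{array}
\]

Although the first set of constraints is monotone, the second is not. The
monotonized relaxation of Clique is:

\[
\begin{array}{lll}
Z_{1/2} = \min & \frac{1}{2} \Big[ \sum_{(i,j) \in E} c_{ij} z_{ij}^+ - \sum_{(i,j) \in E} c_{ij} z_{ij}^- \Big] & \\
\text{subject to} & z_{ij}^+ - x_i^+ \ge 0 & (i,j) \in E \\
 & -z_{ij}^- + x_i^- \ge 0 & \\
 & z_{ij}^+ - x_j^+ \ge 0 & (i,j) \in E \\
 & -z_{ij}^- + x_j^- \ge 0 & \\
 & x_i^+ - x_j^- \ge 1 & (i,j) \notin E \\
 & -x_i^- + x_j^+ \ge 1 & \\
 & x_i^+, z_{ij}^+ \in \{0, 1\}, \; x_i^-, z_{ij}^- \in \{-1, 0\} & \text{for all } i, j.
\end{array}
\]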

This monotone problem is polynomially solvable. The solution is found from a
minimum cut on a graph with $O(m)$ nodes (one for each variable), and $O(n^2)$
arcs (one for each constraint). The running time for solving the monotone problem
is $O(mn^2 \log n)$. We then recover a feasible half integral solution to Clique:

\[
x_i = \frac{x_i^+ - x_i^-}{2}, \qquad z_{ij} = \frac{z_{ij}^+ - z_{ij}^-}{2}.
\]
The variables x must be rounded up in order to satisfy the second set of
constraints. The variables Zij must be rounded up also to satisfy the first set of 
constraints. With this rounding we achieve a feasible integer solution to Clique 
which is within a factor of 2 of the optimum. 

A collection of node and edge deletion related problems is described in 
[Hoc98]. 



8 The Generalized Satisfiability Approximation 
Algorithm 

We consider the problem of generalized 2-satisfiability, MAX GEN2SAT, of
maximizing the weight of satisfied clauses. This problem generalizes MAX 2SAT by
allowing each “clause” (which we refer to as a genclause) to be any boolean
function on two variables.

Although all boolean functions can be expressed as 2SAT in conjunctive
normal form (see Table 1), the challenge is to credit the weight of a given
genclause only if the entire set of 2SAT clauses is satisfied. Our approach for these
maximization problems, described in [HP99], relies on the use of the semidefinite
programming technique pioneered by Goemans and Williamson, [GW95]. The
generic problem is a binary maximization version of IP2, with $d_i = 1$. Hochbaum
and Pathria [HP99] describe how to recognize the approximability of such
problems. The problems in this category are either polynomial time solvable (if
monotone) or $\alpha$ or $\beta$ approximable depending on the types of generalized satisfiability
constraints involved, where $\alpha = 0.87856$, $\beta = 0.79607$. The following theorem of
[HP99] refers to the genclause types given in Table 1.

Theorem 2. An instance of MAX GEN2SAT can be approximated within a
factor of $r_i$ in polynomial time if all genclauses in the given instance are of Type
$i$ or less ($i \in \{0, 1, 2\}$), where $r_0 = 1$, $r_1 = (\alpha - \epsilon)$ for any $\epsilon > 0$, and $r_2 = (\beta - \epsilon)$
for any $\epsilon > 0$.
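
The remark that every boolean function of two variables can be written as a conjunction of 2SAT clauses can be checked mechanically. The sketch below is an illustration, not the construction of [HP99]: it builds one clause for each falsifying assignment of the function, and the names `cnf_of` and `evaluate` and the literal encoding are assumptions.

```python
from itertools import product


def cnf_of(f):
    """CNF of a two-variable boolean function f(a, b).
    A literal is (name, polarity); ("a", False) stands for "not a"."""
    clauses = []
    for a, b in product([False, True], repeat=2):
        if not f(a, b):
            # This clause is false at (a, b) only, so it rules out exactly that assignment.
            clauses.append((("a", not a), ("b", not b)))
    return clauses


def evaluate(clauses, a, b):
    """Truth value of a conjunction of clauses under the assignment (a, b)."""
    value = {"a": a, "b": b}
    return all(any(value[name] == polarity for name, polarity in clause)
               for clause in clauses)


xor_cnf = cnf_of(lambda a, b: a != b)    # two clauses
and_cnf = cnf_of(lambda a, b: a and b)   # three clauses
```

The difficulty noted in the text is visible here: the exclusive-or genclause becomes two ordinary clauses, and its weight should be credited only when both are satisfied together.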



8.1 The Forest Harvesting Application 

Hof and Joyce [HJ92] considered a forest harvesting problem in which there
are two non-timber concerns: that of maintaining old growth forest, and that
of providing a benefit to animals via areas where there is a mix of old growth
forest and harvested land. There is a benefit $H_v$ associated with harvesting cell
$v$; however, there is also a benefit $U_v$ associated with not harvesting cell $v$. In
addition, there is a benefit $B_{ij}$ associated with harvesting exactly one of cells
$i$ or $j$, for cells $i$ and $j$ sharing a common border. The problem is at least as
hard as MAX CUT and thus NP-hard. The corresponding graph optimization
problem is defined below.
Table 1. List of boolean functions on 2 variables (i.e. genclauses).

Type  Label  Symbolic        Adopted Name(s)       Conjunctive Normal Form
             Representation
----  -----  --------------  --------------------  --------------------------------
0     1      1               True
      2      0               False                 (a ∨ b)(¬a ∨ b)(a ∨ ¬b)(¬a ∨ ¬b)
1-A   3      ¬b              negation, inversion   (a ∨ ¬b)(¬a ∨ ¬b)
      4      ¬a              negation, inversion   (¬a ∨ b)(¬a ∨ ¬b)
      5      a ≡ b           equivalence           (¬a ∨ b)(a ∨ ¬b)
      6      a ⊕ b           exclusive-or          (a ∨ b)(¬a ∨ ¬b)
      7      a               identity, assertion   (a ∨ b)(a ∨ ¬b)
      8      b               identity, assertion   (a ∨ b)(¬a ∨ b)
1-B   9      a | b           nand                  (¬a ∨ ¬b)
      10     a ← b           if, implied by        (a ∨ ¬b)
      11     a → b           only if, implies      (¬a ∨ b)
      12     a ∨ b           or, disjunction       (a ∨ b)
2     13     a ↓ b           nor                   (¬a ∨ b)(a ∨ ¬b)(¬a ∨ ¬b)
      14     a > b           inhibition, but-not   (a ∨ b)(a ∨ ¬b)(¬a ∨ ¬b)
      15     a < b           inhibition, but-not   (a ∨ b)(¬a ∨ b)(¬a ∨ ¬b)
      16     ab              and, conjunction      (a ∨ b)(a ∨ ¬b)(¬a ∨ b)

Problem Name: Forest Harvesting: Edge Effects

Instance: Given a graph $G = (V, E)$, two weights $H_v$ and $U_v$ associated with
each vertex $v \in V$, and a benefit $B_e$ associated with each edge $e \in E$.
Optimization Problem: Select a subset of the vertices $S \subseteq V$ that maximizes
overall benefit; that is, the objective is to maximize the quantity

\[
\sum_{v \in S} H_v + \sum_{v \notin S} U_v + \sum_{e \in (S, \bar{S})} B_e .
\]



An integer programming formulation of this problem was presented in [HJ92]; 
Hochbaum and Pathria [HP97] provided a polynomial time solution for instances 
in which the underlying graph is bipartite (which turns out to be a monotone 
IP2). The formulation as integer programming is not useful for approximation 
purposes, only for optimization, as it contains constant terms. 

We now show that this forest harvesting problem can be directly modeled as 
an instance of MAX GEN2SAT, and are thereby able to provide an approxima- 
tion algorithm for all instances (including “non-bipartite” cell structures) of the 
problem. 

For a given instance of the forest harvesting problem, construct an instance 
of MAX GEN2SAT as follows: 










— Associate with each cell $i$ a variable $x_i$ that indicates whether or not the
cell is harvested.

— For each cell $i$ create a genclause equivalent to the expression $x_i$ (form of
either genclause 7 or 8) of weight $H_i$. Also for each cell $i$ create a genclause
equivalent to the expression $\bar{x}_i$ (form of either genclause 3 or 4) of weight
$U_i$.

— For each edge $e = \{i, j\}$ create a genclause equivalent to the expression
$(x_i \oplus x_j)$ (form of genclause 6) with weight $B_e$.

Now, each harvesting decision has a 1:1 correspondence with an assignment
of variables satisfying genclauses of equivalent weight in the MAX GEN2SAT
expression. Because all genclauses in the MAX GEN2SAT instance are of Type
1, it follows from Theorem 2 that we can find a solution within a factor of $(\alpha - \epsilon)$ of the
optimum. Thus, while it was shown in [HP97] that the forest harvesting problem
could be solved optimally when the underlying graph was bipartite, we have now
established an approximation algorithm for general instances of the problem.

Theorem 3. The forest harvesting problem can be approximated within a factor
of $(\alpha - \epsilon)$ in polynomial time.
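
The correspondence between harvesting decisions and genclause weights can be verified by enumeration on a small instance. This is a minimal sketch under assumptions: the instance data and the function names are hypothetical, and the semidefinite programming step of [HP99] is not implemented; only the reduction itself is checked.

```python
from itertools import product


def harvest_benefit(harvested, H, U, B):
    """Forest harvesting objective for the set of harvested cells."""
    total = sum(H[v] if v in harvested else U[v] for v in H)
    # Edge benefit is earned when exactly one endpoint is harvested.
    total += sum(b for (i, j), b in B.items()
                 if (i in harvested) != (j in harvested))
    return total


def build_genclauses(H, U, B):
    """Weighted genclauses as (weight, variables, predicate) triples."""
    clauses = []
    for i in H:
        clauses.append((H[i], (i,), lambda xi: xi))        # x_i, genclause 7 or 8
        clauses.append((U[i], (i,), lambda xi: not xi))    # not x_i, genclause 3 or 4
    for (i, j), b in B.items():
        clauses.append((b, (i, j), lambda xi, xj: xi != xj))  # x_i xor x_j, genclause 6
    return clauses


def satisfied_weight(assignment, clauses):
    """Total weight of the genclauses satisfied by the assignment."""
    return sum(weight for weight, variables, holds in clauses
               if holds(*(assignment[v] for v in variables)))


# Hypothetical instance: three cells in a row.
H = {1: 4, 2: 1, 3: 3}
U = {1: 2, 2: 5, 3: 1}
B = {(1, 2): 2, (2, 3): 3}
clauses = build_genclauses(H, U, B)
cells = sorted(H)
pairs = []
for bits in product([False, True], repeat=len(cells)):
    assignment = dict(zip(cells, bits))
    harvested = {v for v in cells if assignment[v]}
    pairs.append((satisfied_weight(assignment, clauses),
                  harvest_benefit(harvested, H, U, B)))
```

Every entry of `pairs` holds two equal numbers, which is the 1:1 correspondence used in the argument above.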

Another version of the problem has a different benefit associated with harvesting
cell $i$ but not $j$, $B_{ij}$, than with harvesting cell $j$ but not $i$, $B_{ji}$. This can be
considered to be a directed/asymmetric version of the problem with “arc effects”
substituting edge effects. In the objective function of this asymmetric forest
harvesting problem, the term $\sum_{e \in (S, \bar{S})} B_e$ is substituted by $\sum_{(i,j) \in (S, \bar{S})} B_{ij}$. Here
the clauses 14 and 15 substitute clause 6. This leads to an approximation within
a factor of $(\beta - \epsilon)$ in polynomial time.

Theorem 4. The asymmetric forest harvesting problem can be approximated
within a factor of $(\beta - \epsilon)$ in polynomial time.

8.2 Identifying Polynomial Time Solvability 

It is trivial to identify whether an IP2 problem is monotone and thus polynomial 
time solvable. We conclude with an illustration of how this recognition takes 
place for two examples where the polynomial time solvability is far from evident 
given the problem statement. 

Consider a problem of selection of cells in a region where the selection of 
each cell has a benefit or cost associated with it. There is a penalty for having 
two adjacent cells that have different status - namely, one that is selected and 
an adjacent one that is not selected. The aim is to minimize the net total cell 
selection cost and penalty costs. Assuming that the penalty costs are fairly uni- 
form, the solution would tend to be a subregion with as small a boundary as 
possible among regions with equivalent net benefit. 

We let the cells of the region correspond to the set of vertices of a graph, 
V, and two vertices are adjacent if and only if the corresponding two cells are 
adjacent. Let the cost of having two adjacent cells, one selected and one not, be 
$c_{ij}$. Let $w_i$ be the cost/benefit of selecting cell $i$, where a benefit is interpreted
as a negative cost. The problem’s formulation is a monotone integer program:

\[
\text{(Cell Selection)} \quad
\begin{array}{lll}
\min & \sum_{j \in V} w_j x_j + \sum_{(i,j) \in E} c_{ij} z_{ij}^{(1)} + \sum_{(i,j) \in E} c_{ij} z_{ij}^{(2)} & \\
\text{subject to} & x_i - x_j \le z_{ij}^{(1)} & (i,j) \in E \\
 & x_j - x_i \le z_{ij}^{(2)} & (i,j) \in E \\
 & x_i, z_{ij}^{(1)}, z_{ij}^{(2)} \text{ binary} & \text{for all } i, j.
\end{array}
\]

The formulation is valid since at most one of the variables $z_{ij}^{(1)}$, $z_{ij}^{(2)}$ can be equal
to 1 in an optimal solution. Since the formulation is monotone we conclude
immediately that the problem is solvable in polynomial time in integers and
that the solution can be derived by applying a minimum cut procedure on an
associated graph.
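
One way to realize the minimum cut procedure for Cell Selection is sketched below. It is an assumption-laden illustration rather than the construction of [Hoc97]: the graph has a source, a sink and one node per cell; a cell with negative cost gets a source arc, a cell with positive cost gets a sink arc, and each adjacent pair gets two opposite arcs of capacity $c_{ij}$ (assumed nonnegative). The maximum flow routine is a plain Edmonds-Karp implementation, and all names and the instance are hypothetical.

```python
from collections import deque


def min_cut(cap, s, t):
    """Edmonds-Karp. Returns (cut value, nodes on the source side of a minimum cut)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)  # reverse residual arcs
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, capacity in res[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)  # nodes still reachable from s
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck


def select_cells(w, c):
    """w: cell -> cost (negative = benefit); c: (i, j) -> penalty >= 0."""
    s, t = "s", "t"
    cap = {s: {}, t: {}}
    for i in w:
        cap[i] = {}
    for i, cost in w.items():
        if cost < 0:
            cap[s][i] = -cost   # cut when i is left unselected: benefit forgone
        elif cost > 0:
            cap[i][t] = cost    # cut when i is selected: cost paid
    for (i, j), penalty in c.items():
        cap[i][j] = cap[i].get(j, 0) + penalty
        cap[j][i] = cap[j].get(i, 0) + penalty
    cut, source_side = min_cut(cap, s, t)
    # Objective = cut value minus the total benefit available.
    return source_side - {s}, cut + sum(x for x in w.values() if x < 0)


w = {1: -5, 2: 3, 3: -2, 4: 4}
c = {(1, 2): 1, (2, 3): 4, (3, 4): 1, (1, 4): 2}
selected, value = select_cells(w, c)
```

Cells on the source side of the cut are the selected ones; the cut pays the cost of each selected cell, the forgone benefit of each unselected cell, and the penalty of each boundary pair.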

Another application is modeled as the Generalized Independent Set problem. 
In the independent set problem we seek a set of nodes of maximum total weight 
so that no two are adjacent. In the Generalized Independent Set problem it is 
permitted to have adjacent nodes in the set, but at a penalty that may be positive 
or negative. The independent set problem is the special case where the penalties 
are infinite. Therefore the Generalized Independent Set problem is NP-hard. 

An application of the Generalized Independent Set problem comes up in the
context of locating postal service offices, [Ba92]. Each potential location of the
service has a value associated with it. The value, however, is diminished when
several facilities are close enough to compete for the same customers. Following
the principle of inclusion-exclusion, the second order approximation of that loss
is represented by a pairwise interaction cost for every pair of potential facilities.

The postal service problem is defined on a complete graph $G = (V, E)$ where
the pairwise interaction cost, $c_{ij}$, is assigned to every respective edge $(i, j)$. The
formulation of the Generalized Independent Set problem that models all these
problems is:

\[
\text{(Gen-Ind-Set)} \quad
\begin{array}{lll}
\max & \sum_{j \in V} w_j x_j - \sum_{(i,j) \in E} c_{ij} z_{ij} & \\
\text{subject to} & x_i + x_j \le 1 + z_{ij} & (i,j) \in E \\
 & x_i, z_{ij} \text{ binary} & \text{for all } i, j.
\end{array}
\]

When the underlying graph for Generalized Independent Set is bipartite 
then the problem is recognized as solvable in polynomial time [HP97]. Indeed, 
Hochbaum and Pathria found that Generalized Independent Set on bipartite 
graphs is the model for a forest harvesting problem where the regions of the
forest frequently form a grid-like structure. This forest harvesting problem is
thus solved in polynomial time.
Abstract. Deterministic models for project scheduling and control suffer
from the fact that they assume complete information and neglect
random influences that occur during project execution. A typical
consequence is the underestimation of the expected project duration and
cost frequently observed in practice. To cope with these phenomena, we
consider scheduling models in which processing times are random but
precedence and resource constraints are fixed. Scheduling is done by policies
which consist of an online process of decisions that are based on
the observed past and the a priori knowledge of the distribution of
processing times. We give an informal survey on different classes of policies
and show that suitable combinatorial properties of such policies give
insights into optimality, computational methods, and their approximation
behavior. In particular, we present recent constant-factor approximation
algorithms for simple policies in machine scheduling that are based on a
suitable polyhedral relaxation of the performance space of policies.



1 Uncertainty in Scheduling 

In real-life projects, it usually does not suffice to find good schedules for fixed 
deterministic processing times, since these times mostly are only rough estimates 
and subject to unpredictable changes due to unforeseen events such as weather 
conditions, obstruction of resource usage, delay of jobs and others. 

In order to model such influences, the processing time of a job $j \in V$ is
assumed to be a random variable $p_j$. Then $p = (p_1, p_2, \ldots, p_n)$ denotes the
(random) vector of processing times, which is distributed according to a joint
probability distribution $Q$. This distribution $Q$ is assumed to be known and may
also contain stochastic dependencies. Furthermore, as in deterministic models,
we have precedence constraints given by a directed acyclic graph $G = (V, E)$
and resource constraints. In the classification scheme of [1], these problems are
denoted by $PS \mid prec, p_j = sto \mid \kappa$, where $\kappa$ is the objective (e.g. the project
makespan $C_{\max}$ or the sum of weighted completion times $\sum w_j C_j$).

* Supported by Deutsche Forschungsgemeinschaft under grant Mo 346/3-3 and by 
German Israeli Foundation under grant 1-564-246.06/97. 
The necessity to deal with uncertainty in project planning becomes obvious 
if one compares the “deterministic makespan” $C_{\max}(E(p_1), \dots, E(p_n))$ 
obtained from the expected processing times $E(p_j)$ with the expected makespan 
$E(C_{\max}(p))$. Even in the absence of resource constraints, there is a systematic 
underestimation $C_{\max}(E(p_1), \dots, E(p_n)) \le E(C_{\max}(p_1, \dots, p_n))$ which may 
become arbitrarily large with increasing number of jobs or increasing variances of 
the processing times [7]. Equality holds if and only if there is one path that 
is the longest with probability 1. This systematic underestimation of the 
expected makespan has already been observed by Fulkerson [2]. The error becomes 
even worse if one compares the deterministic value $C_{\max}(E(p_1), \dots, E(p_n))$ with 
quantiles $t_q$ such that $\mathrm{Prob}\{C_{\max}(p) \le t_q\} \ge q$ for large values of $q$ (say $q = 0.9$ 
or $0.95$). 

A simple example is given in Figure 1 for a project with $n$ parallel jobs 
that are independent and uniformly distributed on $[0,2]$. Then the deterministic 
makespan $C_{\max}(E(p_1), \dots, E(p_n)) = 1$, while $\mathrm{Prob}\{C_{\max} \le 1\} \to 0$ for $n \to \infty$. 
Similarly, all quantiles $t_q \to 2$ for $n \to \infty$ (and $q > 0$). 

This is the reason why good practical planning tools should incorporate 
stochastic methods. 
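
The example with n parallel uniform jobs can be checked with closed formulas. The following is a minimal sketch, assuming only what the example states (independent jobs, each uniform on [0,2], makespan equal to the maximum processing time); the function names are illustrative and not taken from the paper.

```python
def makespan_cdf(t: float, n: int) -> float:
    """Prob{C_max <= t} for n independent jobs, each uniform on [0, 2]."""
    if t <= 0.0:
        return 0.0
    if t >= 2.0:
        return 1.0
    # The maximum is at most t iff every single job is at most t.
    return (t / 2.0) ** n


def makespan_quantile(q: float, n: int) -> float:
    """Smallest t with Prob{C_max <= t} >= q, for 0 < q <= 1."""
    return 2.0 * q ** (1.0 / n)


def expected_makespan(n: int) -> float:
    """E(C_max), the integral over [0, 2] of 1 - (t/2)^n, which is 2n / (n + 1)."""
    return 2.0 * n / (n + 1)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, makespan_cdf(1.0, n), makespan_quantile(0.9, n), expected_makespan(n))
```

The deterministic makespan stays at 1 for every n, whereas the probability of finishing by time 1 halves with each additional job and both the expectation and the quantiles approach 2.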




Fig. 1. Distribution function of the makespan for n = 1, 2, 4, 8 parallel jobs that 
are independent and uniformly distributed on [0,2]. 



2 Planning with Policies 

If the problem involves only precedence constraints, every job can be scheduled 
at its earliest start, i.e., when its last predecessor completes. This is no longer 
possible when resource constraints are present. Planning is then done by policies 
or strategies that dynamically make scheduling decisions based on the observed 
past and the a priori knowledge about the processing time distributions. This can 
be seen as a special stochastic dynamic optimization problem or as an online 
algorithm against a “randomizing” adversary who draws job processing times 
according to a known distribution. 

This model is somewhat related to certain online scenarios, which recently 
have received quite some attention. These scenarios are also based on the 
assumption that the scheduler does not have access to the whole instance at once, 
but rather learns the input piece by piece over time and has to make decisions 
based on partial knowledge only. When carried to an extreme, there is both a 
lack of knowledge on jobs arriving in the future and the running time of every 
job is unknown until it completes. In these models, online algorithms are usually 
analyzed with respect to optimum off-line solutions, whereas here we compare 
ourselves with the best possible policy which is subject to the same uncertainty. 
Note that our model is also more moderate than online scheduling in the sense 
that the number of jobs to be scheduled as well as their joint processing time 
distribution are known in advance. We refer to [16] for an overview on the other 
online scheduling models. 

Our model with random processing times has been studied in machine 
scheduling, but much less in project scheduling. The survey [9] has stayed representative 
for most of the work until the mid-1990s. 

A policy $\Pi$ takes actions at decision points, which are $t = 0$ (project start), 
job completions, and tentative decision times where information becomes 
available. An action at time $t$ consists in choosing a feasible set of jobs to be started at 
$t$, where feasible means that precedence and resource constraints are respected, 
and in choosing the next tentative decision time $t^{planned}$. The actual next decision 
time is the minimum of $t^{planned}$ and the first job completion after $t$. 

The decision which action to take may of course only exploit information 
of the past up to time $t$ and the given distribution $Q$ (non-anticipative 
character of policies). After every job has been scheduled, we have a realization 
$p = (p_1, \dots, p_n)$ of processing times and $\Pi$ has constructed a schedule 
$\Pi[p] = (S_1, S_2, \dots, S_n)$ of starting times $S_j$ for the jobs $j$. If $\kappa^\Pi(p)$ denotes the “cost” of 
that schedule, and $E(\kappa^\Pi(p))$ the expected cost under policy $\Pi$, the aim then is 
to find a policy that minimizes the expected cost (e.g., the expected makespan). 






[Figure 2: objective $\min E(C_{\max})$; realizations $x^\varepsilon = (1 + \varepsilon, 4, 4, 8, 4)$ and $y = (1, 4, 4, 4, 8)$, each with probability $1/2$; job 1 is observed at the tentative decision time $t^{planned} = 1$.] 



Fig. 2. Optimal policies may involve tentative decision times. 



As an example, consider the problem in Figure 2. The precedence constraints 
are given by the digraph in the upper left corner. The two encircled jobs 2 and 3 
compete for the same scarce resource and may not be scheduled simultaneously. 
There are two possible realizations $x^\varepsilon$ and $y$ that occur with probability $1/2$ each. 
The aim is to minimize the expected makespan. If one starts jobs 1 and 2, say, 
at time 0, one achieves only an expected makespan of 14. Observing job 1 at 
the tentative decision time $t^{planned} = 1$ yields 13 instead. The example shows in 
particular that jobs may start at times where no other job ends. 

In general, there need not exist an optimal policy. This can only be 
guaranteed under assumptions on the cost function $\kappa$ (e.g. continuity) or the 
distribution $Q$ (e.g. finite discrete or with a Lebesgue density), see [9] for more 
details. 

Stability issues constitute an important reason for considering only restricted 
classes of policies. Data deficiencies and the use of approximate methods (e.g. 
simulation) require that the optimum expected cost $OPT(\bar\kappa, \bar Q)$ for an 
“approximate” cost function $\bar\kappa$ and distribution $\bar Q$ is “close” to the optimum expected 
cost $OPT(\kappa, Q)$ when $\bar\kappa$ and $\bar Q$ are “close” to $\kappa$ and $Q$, respectively. (This can be 
made precise by considering uniform convergence $\bar\kappa \to \kappa$ of cost functions and 
weak convergence $\bar Q \to Q$ of probability distributions.) 

Unfortunately, the class of all policies is unstable. The above example 
illustrates why. Consider $Q^\varepsilon$ as an approximation to $Q = \lim_{\varepsilon \to 0} Q^\varepsilon$. For $Q$, one is 
no longer able to obtain information by observing job 1, and thus only achieves 
an average makespan of 14. Figure 3 illustrates this. 



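[Figure 3: $\varepsilon \to 0$ implies $Q^\varepsilon \to Q$, where $Q$ has the realizations $x = (1, 4, 4, 8, 4)$ and $y = (1, 4, 4, 4, 8)$, each with probability $1/2$. No information is available when job 1 completes, so job 2 is started at $t = 0$.] 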




Fig. 3. The class of all policies is unstable. 



The main reason for this instability is the fact that policies may use small, 
almost “not observable” pieces of information for their decision. This can be 
overcome by restricting to robust policies. These are policies that start jobs only 
at completion of other jobs (no tentative decision times) and use only “robust” 
information from the past, viz. only the fact whether a job is completed, busy, 
or not yet started. 
3 Robust Classes of Policies 

Policies may be classified by their way of resolving the “essential” resource 
conflicts. These conflicts can be modeled by looking at the set $\mathcal{F}$ of forbidden sets of 
jobs. Every proper subset $F' \subset F$ of such a set $F \in \mathcal{F}$ can in principle be 
scheduled simultaneously, but the set $F$ itself cannot because of the limited resources. 
In the above example, $\mathcal{F} = \{\{2, 3\}\}$. For a scheduling problem with $m$ identical 
machines, $\mathcal{F}$ consists of all $(m+1)$-element independent sets of the digraph $G$ 
of precedence constraints. 
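
For the machine case, the forbidden sets can be enumerated directly on small instances. The sketch below makes two assumptions that are not in the text: independence is read with respect to the transitive closure of the precedence digraph (no job of the set has to wait, directly or indirectly, for another one), and the 7-job arc list `ARCS` is a hypothetical instance invented for illustration.

```python
from itertools import combinations

# Hypothetical precedence arcs (i, j), meaning "job j must wait for job i".
ARCS = [(1, 4), (1, 5), (2, 3), (3, 5), (3, 7), (4, 6), (5, 6)]


def transitive_closure(jobs, arcs):
    """Return succ[i] = set of all jobs that wait, directly or indirectly, for i."""
    succ = {j: set() for j in jobs}
    for i, j in arcs:
        succ[i].add(j)
    # Simple fixed-point iteration; adequate for small illustrative instances.
    changed = True
    while changed:
        changed = False
        for i in jobs:
            reachable = set()
            for j in succ[i]:
                reachable |= succ[j]
            if not reachable <= succ[i]:
                succ[i] |= reachable
                changed = True
    return succ


def forbidden_sets(jobs, arcs, m):
    """All (m + 1)-element sets of pairwise independent jobs, as sorted tuples."""
    succ = transitive_closure(jobs, arcs)

    def independent(i, j):
        return j not in succ[i] and i not in succ[j]

    return [c for c in combinations(sorted(jobs), m + 1)
            if all(independent(i, j) for i, j in combinations(c, 2))]


if __name__ == "__main__":
    print(forbidden_sets(range(1, 8), ARCS, m=2))
```

The enumeration is exponential in m, which is one reason why policies that avoid an explicit list of forbidden sets are of interest.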

3.1 Priority Policies 

A well-known class of robust policies is the class of priority policies. They settle 
the resource conflicts by a priority list $L$, i.e., at every decision time, they start as 
many jobs as possible in the order of $L$. Though simple and easy to implement, 
they exhibit a rather unsatisfactory stability behavior (Graham anomalies). Let 
us view a policy $\Pi$ as a function $\Pi : \mathbb{R}^n \to \mathbb{R}^n$ that maps every vector 
$p = (p_1, \dots, p_n)$ of processing times to a schedule $\Pi[p] = (S_1, S_2, \dots, S_n)$ of starting 
times $S_j$ for the jobs $j$. In this interpretation as a function, priority policies are 
in general neither continuous nor monotone. This is illustrated in Figure 4 on 
one of Graham’s examples [5]. When $p$ changes continuously and monotonically 
from $y$ into $x$, $\Pi[p]_7 = S_7$ jumps discontinuously and $\Pi[p]_5 = S_5$ decreases while 
$p$ grows. 
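
A priority policy is easy to simulate for a single realization of the processing times. The sketch below uses the vectors x = (4,2,2,5,5,10,10) and y = x - 1 that appear in the figures, two machines and the list 1 < 2 < ... < 7. The precedence arcs in `ARCS` are an assumption: the digraph of the figure is not reproduced in the text, so a hypothetical one was chosen for which {4, 5, 7} is the only three-element set of independent jobs.

```python
# Hypothetical precedence arcs (i, j), meaning "job j must wait for job i".
ARCS = [(1, 4), (1, 5), (2, 3), (3, 5), (3, 7), (4, 6), (5, 6)]
X = dict(zip(range(1, 8), (4, 2, 2, 5, 5, 10, 10)))
Y = {j: t - 1 for j, t in X.items()}


def priority_policy(p, arcs, order, m):
    """Start times of the priority policy for one realization p (job -> duration).

    Assumes positive durations and acyclic precedence arcs.
    """
    preds = {j: set() for j in p}
    for i, j in arcs:
        preds[j].add(i)
    start, finish = {}, {}
    t = 0
    while len(start) < len(p):
        done = {j for j in finish if finish[j] <= t}
        busy = len(start) - len(done)
        # Start as many available jobs as possible in the order of the list.
        for j in order:
            if busy == m:
                break
            if j not in start and preds[j] <= done:
                start[j], finish[j] = t, t + p[j]
                busy += 1
        # Next decision time: the earliest completion after t.
        t = min(f for f in finish.values() if f > t)
    return start


def makespan(p, start):
    return max(start[j] + p[j] for j in p)


if __name__ == "__main__":
    order = list(range(1, 8))
    for name, p in (("x", X), ("y", Y)):
        s = priority_policy(p, ARCS, order, m=2)
        print(name, s, makespan(p, s))
```

On this hypothetical instance the makespan is 19 for x and 20 for the componentwise smaller vector y, and job 5 starts later under y than under x, which is the kind of anomaly the paragraph describes.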



[Figure 4: 2 identical machines; objective $\min C_{\max}$; priority list $L = 1 < 2 < \dots < 7$; Gantt charts for $x$ and for $y = x - 1 = (3, 1, 1, 4, 4, 9, 9)$.] 

Fig. 4. Priority policies are neither continuous nor monotone. 





3.2 Preselective Policies 

Policies with a much better stability behavior are the preselective policies 
introduced in [8]. They solve every resource conflict given by a forbidden set $F \in \mathcal{F}$ 
by choosing a priori a waiting job $j_F \in F$ that can only start after at least one 
job $j \in F \setminus \{j_F\}$ has completed. This defines a disjunctive waiting condition 
$(F \setminus \{j_F\}, j_F)$ for every forbidden set $F \in \mathcal{F}$. A preselective policy then does early 
start scheduling w.r.t. the precedence constraints and the system $\mathcal{W}$ of waiting 
conditions obtained from choosing waiting jobs for the forbidden sets. The same 
idea is known in deterministic scheduling as delaying alternatives, see e.g. [1, 
Section 3.1]. 

A very useful combinatorial model for preselective policies has been 
introduced in [13]. Waiting conditions and ordinary precedence constraints are 
modeled by an AND/OR graph that contains AND-nodes for the ordinary precedence 
constraints and OR-nodes for the disjunctive precedence constraints. Figure 5 
shows how. 




[Figure 5, panel labels: “select 4 for both forbidden sets”; “AND/OR network representing the preselective policy”.] 

Fig. 5. AND/OR graph induced by a preselective policy. 



Since preselective policies do early start scheduling, it follows that the start 
time of a job $j$ is the minimum length of a longest path to node $j$ in the AND/OR 
graph, where the minimum is taken over the different alternatives in the 
OR-nodes. As a consequence, preselective policies are continuous and monotone (in 
the function interpretation) and thus avoid the Graham anomalies (see Figure 6). 
Surprisingly, the converse is also true, i.e. every continuous robust policy is 
preselective and every monotone policy $\Pi$ is dominated by a preselective policy, i.e., 
there is a preselective policy $\Pi'$ with $\Pi' \le \Pi$ [15]. This implies in particular 
that Graham anomalies come in pairs (discontinuous and not monotone). 

There are several interesting and natural questions related to AND/OR graphs 
(or preselective policies). One is feasibility, since an AND/OR graph may contain 
cycles. Here feasibility means that all jobs $j \in V$ can be arranged in a linear list 
$L$ such that all waiting conditions given by the AND/OR graph are satisfied (all 
AND predecessors of a job $j$ occur before $j$ in $L$, and at least one OR predecessor 
occurs before $j$ in $L$). Another question is transitivity, i.e., is a new waiting 
condition “$j$ waits for at least one job from $V' \subset V$” implied by the given ones? 
Or transitive reduction, i.e., is there a unique “minimal” AND/OR graph that is 
equivalent to a given one (in the sense that they admit the same linear lists)? 
All these questions have been addressed in [13]. Feasibility can be detected in 
linear time. A variant of feasibility checking computes transitively forced waiting 
conditions, and there is a unique minimal representation that can be constructed 
in polynomial time. 

[Figure 6: 2 identical machines, so $F = \{4, 5, 7\}$ is the only forbidden set, with job 7 as waiting job; Gantt charts for $x = (4, 2, 2, 5, 5, 10, 10)$ and $y = x - 1 = (3, 1, 1, 4, 4, 9, 9)$.] 

Fig. 6. A preselective policy for Graham’s example. 
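
The feasibility notion for AND/OR graphs described above can be illustrated by a greedy construction of a linear list. This is a minimal sketch and not the linear-time algorithm of [13]: it runs in quadratic time, and the data layout (`and_preds`, `or_conds`) is an assumption made for the example.

```python
def linear_list(jobs, and_preds, or_conds):
    """Return a feasible linear list of the jobs, or None if none exists.

    and_preds[j]: set of jobs that must all occur before j.
    or_conds[j]:  list of sets; at least one job of each set must occur before j.
    """
    placed, order = set(), []
    progress = True
    while progress and len(order) < len(jobs):
        progress = False
        for j in jobs:
            if j in placed:
                continue
            and_ok = and_preds.get(j, set()) <= placed
            or_ok = all(w & placed for w in or_conds.get(j, []))
            if and_ok and or_ok:
                placed.add(j)
                order.append(j)
                progress = True
    return order if len(order) == len(jobs) else None


if __name__ == "__main__":
    # Job 3 waits for job 1; job 1 waits for at least one of the jobs 2 and 3.
    print(linear_list([1, 2, 3], {3: {1}}, {1: [{2, 3}]}))
    # Two jobs that wait for each other: no linear list exists.
    print(linear_list([1, 2], {}, {1: [{2}], 2: [{1}]}))
```

The greedy choice is safe because appending a job to the list can never invalidate a condition that was already satisfied.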

The AND/OR graph is also useful for several computational tasks related to 
preselective policies. 

First, it provides the right structure to compute the start times $\Pi[p]$ of a 
policy $\Pi$ for a given realization $p$ of processing times. This can be done by a 
Dijkstra-like algorithm since all processing times $p_j$ are positive, see [12]. For 
general arc weights, the complexity status of computing earliest start times in 
AND/OR graphs is open. The corresponding decision problem “Is the earliest start 
of node $v$ at most $t$?” is in $NP \cap coNP$ and only pseudopolynomial algorithms 
are known to compute the earliest start times. This problem is closely related to 
mean-payoff games considered e.g. in [19]. This and other relationships as well as 
more applications of AND/OR graphs (such as disassembly in scheduling [4]) are 
discussed in [12]. [12] also derives a polynomial algorithm to compute the earliest 
start times when all arc weights are non-negative. This is already a non-trivial 
task that requires checking for certain 2-connected subgraphs with arc weights 
0, which can be done by a variant of the feasibility checking algorithm. 
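
For positive processing times, the Dijkstra-like computation can be sketched with a heap of job completions. This is an illustration under stated assumptions rather than the algorithm of [12]: the arc list `ARCS` is a hypothetical instance, and the single waiting condition in `WAITING` (job 7 waits for job 4 or job 5) corresponds to choosing 7 as waiting job of the forbidden set {4, 5, 7}.

```python
import heapq

# Hypothetical precedence arcs (i, j), meaning "job j must wait for job i".
ARCS = [(1, 4), (1, 5), (2, 3), (3, 5), (3, 7), (4, 6), (5, 6)]
WAITING = [({4, 5}, 7)]  # job 7 waits for at least one of the jobs 4 and 5
X = dict(zip(range(1, 8), (4, 2, 2, 5, 5, 10, 10)))
Y = {j: t - 1 for j, t in X.items()}


def preselective_start_times(p, arcs, waiting):
    """Early-start schedule for AND arcs plus OR waiting conditions.

    Returns job -> start time, or None if some job can never start.
    """
    open_and = {j: 0 for j in p}      # unsatisfied AND predecessors of j
    and_succ = {j: [] for j in p}
    for i, j in arcs:
        open_and[j] += 1
        and_succ[i].append(j)
    open_or = {j: 0 for j in p}       # unsatisfied waiting conditions of j
    conds = []                        # entries [waiting job, satisfied?]
    watch = {j: [] for j in p}        # job i -> conditions that i can satisfy
    for k, (w, j) in enumerate(waiting):
        conds.append([j, False])
        open_or[j] += 1
        for i in w:
            watch[i].append(k)

    start, heap = {}, []

    def release(j, t):
        if open_and[j] == 0 and open_or[j] == 0 and j not in start:
            start[j] = t
            heapq.heappush(heap, (t + p[j], j))

    for j in p:
        release(j, 0)
    while heap:
        t, i = heapq.heappop(heap)    # completions in nondecreasing time order
        for j in and_succ[i]:
            open_and[j] -= 1
            release(j, t)
        for k in watch[i]:
            if not conds[k][1]:       # the first completion satisfies an OR condition
                conds[k][1] = True
                open_or[conds[k][0]] -= 1
                release(conds[k][0], t)
    return start if len(start) == len(p) else None


if __name__ == "__main__":
    print(preselective_start_times(X, ARCS, WAITING))
    print(preselective_start_times(Y, ARCS, WAITING))
```

On this instance no start time under y exceeds the corresponding start time under x, in line with the monotonicity of preselective policies.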

Second, it can be used to detect implied waiting conditions, which is useful 
if “good” or optimal preselective policies are constructed in a branch and bound 
approach. There, branching is done on the possible choices of waiting jobs for 
a forbidden set F as demonstrated in Figure 7. The algorithm for detecting 
implied waiting conditions can then be used to check if the forbidden set F of 
the current tree node N has already an implicit waiting job that is implied by 
the earlier choices of waiting jobs in ancestors of N . 

This also provides a criterion for dominance shown in [14]: A preselective 
policy is dominated iff no forbidden set F has a transitively implied waiting job 
that is different from the chosen waiting job. 
Fig. 7. Computing preselective policies by branch and bound. 



3.3 Special Preselective Policies 

There are several interesting subclasses of preselective policies. 

For instance, one may in addition to the waiting job $j_F$ from a forbidden 
set $F$ also choose the job $i_F \in F$ for which $j_F$ must wait. That means that the 
conflict given by $F$ is solved by introducing the additional precedence constraint 
$i_F < j_F$. Then the AND/OR graph degenerates into an AND graph, i.e., consists 
like $G$ of ordinary precedence constraints. Feasibility of AND/OR graphs then 
reduces to being acyclic and early start scheduling can be done by longest path 
calculations. This class of policies is known as earliest start policies [8]. They are 
convex functions and every convex policy is already an earliest start policy [15]. 

Another class, the class of linear preselective policies, has been introduced in 
[14]. It is motivated by the precedence tree concept used for $PS \mid prec \mid C_{\max}$, see 
e.g. [1, Section 3.1], and combines the advantages of preselective policies and 
priority rules. Such a policy $\Pi$ uses a priority list $L$ (that is a topological sort 
of the graph $G$ of precedence constraints) and chooses the waiting job $j_F \in F$ 
as the last job from $F$ in $L$. It follows that a preselective policy is linear iff the 
corresponding AND/OR graph is acyclic. This class of policies possesses many 
favorable properties regarding domination, stability, computational effectiveness, 
and solution quality. Computational evidence is given in [17]. An example is 
presented in Figure 8. Here “Nodes” refers to the number of nodes considered 
in the branch and bound tree. 
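
The selection rule of a linear preselective policy, taking as waiting job the last job of each forbidden set in the list, fits in a few lines. A minimal sketch; the function name and the example data are hypothetical.

```python
def linear_preselective_waiting_jobs(forbidden, order):
    """Map each forbidden set F to its waiting job, the last job of F in the list."""
    position = {j: k for k, j in enumerate(order)}
    return {frozenset(f): max(f, key=position.__getitem__) for f in forbidden}


if __name__ == "__main__":
    print(linear_preselective_waiting_jobs([{4, 5, 7}], [1, 2, 3, 4, 5, 6, 7]))
    print(linear_preselective_waiting_jobs([{4, 5, 7}], [1, 2, 3, 7, 4, 5, 6]))
```

Different lists can induce the same selection, so the number of distinct linear preselective policies may be smaller than the number of lists.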

Linear preselective policies are related to job-based priority rules known from 
deterministic scheduling. These behave like priority policies, but obey the 
additional constraint that the start times $S_j$ preserve the order given by the priority 
list $L$, i.e. $S_i \le S_j$ if $i$ precedes $j$ in $L$. Every job-based priority rule is linear 
preselective [14] but may be dominated by a better linear preselective policy. 
The advantage of job-based priority policies lies in the fact that they do not 
explicitly need to know the (possibly large) system $\mathcal{F}$ of forbidden sets. They 
settle the conflicts online by the condition on the start times. 



Truncated Erlang distribution on [0.2*mean; 2.6*mean] 

Optimum deterministic makespan: 203 (CPU: 0.17 sec) 
Optimum expected makespan: 243.2 
Optimal preselective policy: Nodes: 115007, CPU: 3772.01 sec 
Opt. linear preselective policy: Nodes: 4209, CPU: 49.85 sec 



Fig. 8. Computational results for linear preselective policies. 



3.4 General Robust Policies 

The class of all robust policies has been studied in [10] under the name set 
policies (as the decision at a decision time t is only based on the knowledge of 
the set of completed jobs and the set of busy jobs). 

These policies behave locally like earliest start policies, i.e., for every robust 
policy $\Pi$, there is a partition of $\mathbb{R}^n$ into finitely many polyhedral cones such that, 
locally on each cone, $\Pi$ is an earliest start policy and thus convex, continuous and 
monotone. This shows that Graham anomalies can only occur at the boundaries 
of these cones. 

It turns out that for problems with independent exponential processing time 
distributions and “additive” cost functions, there is an optimal policy that is 
robust. 

Here additive means that there is a set function $g : 2^V \to \mathbb{R}$ (the cost rate) 
such that $\kappa(C_1, \dots, C_n) = \int g(U(t))\,dt$, where $U(t)$ denotes the set of jobs that 
are still uncompleted at time $t$. Special cases are $\kappa = C_{\max}$, where $g(\emptyset) := 0$ and 
$g(U) = 1$ otherwise, and $\kappa = \sum w_j C_j$, where $g(U) = \sum_{j \in U} w_j$. 
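
The notion of an additive cost function can be made concrete by integrating the cost rate over the intervals between consecutive completion times. The sketch below assumes positive completion times and uses hypothetical example data; the rate for the weighted sum is taken to be the total weight of the uncompleted jobs, which is the choice that reproduces the sum of weighted completion times.

```python
def additive_cost(completion, g):
    """Integrate the cost rate g(U(t)) over time, U(t) = jobs uncompleted at time t."""
    total, t_prev = 0.0, 0.0
    for t in sorted(set(completion.values())):
        # On the interval (t_prev, t) exactly the jobs with C_j > t_prev are uncompleted.
        uncompleted = frozenset(j for j, c in completion.items() if c > t_prev)
        total += g(uncompleted) * (t - t_prev)
        t_prev = t
    return total


COMPLETION = {1: 2, 2: 5, 3: 5}   # hypothetical completion times C_j
WEIGHTS = {1: 3, 2: 1, 3: 2}      # hypothetical weights w_j


def rate_makespan(u):
    return 1.0 if u else 0.0


def rate_weighted(u):
    return float(sum(WEIGHTS[j] for j in u))


if __name__ == "__main__":
    print(additive_cost(COMPLETION, rate_makespan))   # equals max C_j
    print(additive_cost(COMPLETION, rate_weighted))   # equals sum of w_j * C_j
```

Each job j contributes its weight for exactly C_j time units, which is why the second integral equals the weighted sum of completion times.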

In more special cases (no precedence constraints, $m$ identical parallel 
machines) there may even be optimal policies that are priority policies (again 
for independent exponential processing time distributions). If $\kappa = C_{\max}$, then 
LEPT (longest expected processing time first) is known to be optimal, while for 
$\kappa = \sum C_j$, SEPT (shortest expected processing time first) is optimal [18]. It is an 
open problem if there is also an optimal priority policy for $\kappa = \sum w_j C_j$. 



4 How Good Are Simple Policies? 

Compared to its deterministic counterpart, only little is known about the 
approximation behavior of (simple) policies for arbitrary processing time 
distributions. A first step into this direction is taken in [11] for the problem of 
minimizing the average weighted completion time on identical parallel machines, i.e., 
$P \mid r_j, p_j = sto \mid \sum w_j C_j$. 

This approach is based on a suitable polyhedral relaxation of the 
“performance space” $S = \{(E(C_1^\Pi), \dots, E(C_n^\Pi)) \mid \Pi \text{ policy}\}$ of all vectors of expected 
completion times achieved by any policy. The optimal solution of an LP over this 
polyhedral relaxation is then used to construct priority and linear preselective 
policies, and these are shown to have constant factor performance guarantees, 
even in the presence of release dates. This generalizes several previous results 
from deterministic scheduling and also yields a worst case performance guarantee 
for the well known WSEPT heuristic. 

We will illustrate this in the simplest case $P \mid p_j = sto \mid \sum w_j C_j$ and refer 
to [11] for more information and also related work on the optimal control of 
stochastic systems [3]. 

A policy is called an $\alpha$-approximation if its expected cost is always within a 
factor of $\alpha$ of the optimum value, and if it can be determined and executed in 
polynomial time with respect to the input size of the problem. To cope with the 
input size of a stochastic scheduling problem, which includes non-discrete data in 
general, we assume that the input is specified by the number of jobs, the number 
of machines, and the encoding lengths of weights $w_j$, release dates $r_j$, expected 
processing times $E[p_j]$, and, as the sole stochastic information, an upper bound 
$\Delta$ on the squared coefficients of variation $\mathrm{Var}[p_j]/E[p_j]^2$ of all processing time 
distributions $p_j$, $j = 1, \dots, n$. The coefficient of variation of a given random variable $X$ is the ratio 
$\sqrt{\mathrm{Var}[X]}/E[X]$. Thus, it is in particular sufficient if all second moments $E[p_j^2]$ are 
given. This notion of input size is motivated by the fact that from a practitioner’s 
point of view the expected processing times of jobs together with the assumption 
of some typical distribution “around them” is realistic and usually suffices to 
describe a stochastic scheduling problem. Note, however, that the performance 
guarantees obtained actually hold with respect to optimal policies that make use 
of the complete knowledge of the distributions of processing times. 

The polyhedral relaxation $\mathcal{P}$ is derived by a pointwise argument from known 
valid inequalities in completion time variables for the deterministic case [6]. 
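Besides $E(C_j) \ge E(p_j)$ the crucial valid inequalities are 

$$\sum_{j \in A} E(p_j) E(C_j) \ge \frac{1}{2m} \Big( \sum_{j \in A} E(p_j) \Big)^2 + \frac{1}{2} \sum_{j \in A} (E(p_j))^2 - \frac{m-1}{2m} \sum_{j \in A} \mathrm{Var}(p_j) \quad \text{for all } A \subseteq V.$$

They differ from the deterministic counterpart in the term involving the 
variances of the processing times. With the upper bound $\Delta$, used in the form 
$\mathrm{Var}(p_j) \le \Delta \, (E(p_j))^2$ for all $j$, they may be rewritten as 

$$\sum_{j \in A} E(p_j) E(C_j) \ge \frac{1}{2m} \Big( \Big( \sum_{j \in A} E(p_j) \Big)^2 + \sum_{j \in A} (E(p_j))^2 \Big) - \frac{(m-1)(\Delta-1)}{2m} \sum_{j \in A} (E(p_j))^2 \quad \text{for all } A \subseteq V.$$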
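
The rewriting step is a short calculation. The following sketch assumes that the bound enters as $\mathrm{Var}(p_j) \le \Delta \, (E(p_j))^2$ and that the first family of inequalities carries the terms $\frac{1}{2} \sum_{j \in A} (E(p_j))^2$ and $-\frac{m-1}{2m} \sum_{j \in A} \mathrm{Var}(p_j)$; only these two terms change.

```latex
\begin{align*}
\frac{1}{2} \sum_{j \in A} (E(p_j))^2 - \frac{m-1}{2m} \sum_{j \in A} \mathrm{Var}(p_j)
  &\ge \Big( \frac{1}{2} - \frac{(m-1)\Delta}{2m} \Big) \sum_{j \in A} (E(p_j))^2 \\
  &= \Big( \frac{1}{2m} + \frac{m-1}{2m} - \frac{(m-1)\Delta}{2m} \Big) \sum_{j \in A} (E(p_j))^2 \\
  &= \frac{1}{2m} \sum_{j \in A} (E(p_j))^2 - \frac{(m-1)(\Delta-1)}{2m} \sum_{j \in A} (E(p_j))^2 .
\end{align*}
```

For $\Delta \le 1$ the last term is non-negative, so the variance correction does not weaken the bound in that case.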



The LP relaxation $\min\{\sum_{j \in V} w_j C_j \mid C \in \mathcal{P}\}$ can be solved in polynomial 
time by purely combinatorial methods in O(n^) time [11]. An optimal solution 
$C^{LP} = (C_1^{LP}, \dots, C_n^{LP})$ to this LP defines an ordering $L$ of jobs according to 
nondecreasing values of $C_j^{LP}$. This list $L$ is then used to define a priority policy 
or linear preselective policy for the original problem. 

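If $\Pi$ denotes such a policy, clearly $\sum_{j \in V} w_j C_j^{LP} \le OPT \le \sum_{j \in V} w_j E(C_j^\Pi)$, 
and the goal is to prove $\sum_{j \in V} w_j E(C_j^\Pi) \le \alpha \sum_{j \in V} w_j C_j^{LP}$ for some $\alpha \ge 1$. This 
leads to a performance guarantee of $\alpha$ for the policy $\Pi$ and also to a (dual) 
guarantee for the quality of the LP lower bound: $\sum_{j \in V} w_j E(C_j^\Pi) \le \alpha \cdot OPT$ and 
$\sum_{j \in V} w_j C_j^{LP} \ge \frac{1}{\alpha} OPT$. 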

The performance guarantee thus obtained is $\alpha = 2 - \frac{1}{m} + \max\{1, \frac{m-1}{m} \Delta\}$, 
which may be improved to $\alpha = 1 + \frac{(\Delta+1)(m-1)}{2m}$ by the use of a specific priority 
policy (WSEPT, weighted shortest expected processing time first). For problems with release dates, 
this priority policy can be arbitrarily bad, and the best guarantee is given by a 
job-based priority policy defined via the LP. The guarantees become stronger if 
$\Delta \le 1$, which is the case for distributions that are NBUE (new better than used 
in expectation), which seems to be a reasonable class for applications. 
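
The WSEPT priority list itself is simple to compute: jobs are ordered by nonincreasing ratio of weight to expected processing time. The sketch below also evaluates the expected cost of a list on a single machine, where the expected completion time of a job is the sum of the expected processing times up to and including that job. The restriction to one machine and the example data are assumptions made for illustration; the text treats m identical machines.

```python
def wsept_order(weights, expected_times):
    """Priority list with nonincreasing ratios w_j / E[p_j], ties broken by job id."""
    return sorted(weights, key=lambda j: (-weights[j] / expected_times[j], j))


def expected_cost_single_machine(order, weights, expected_times):
    """Expected sum of w_j * C_j when one machine processes the list without idling."""
    total, elapsed = 0.0, 0.0
    for j in order:
        elapsed += expected_times[j]   # expected completion time of job j
        total += weights[j] * elapsed
    return total


WEIGHTS = {1: 1, 2: 4, 3: 2}             # hypothetical weights w_j
EXPECTED = {1: 2.0, 2: 2.0, 3: 4.0}      # hypothetical expected times E[p_j]

if __name__ == "__main__":
    order = wsept_order(WEIGHTS, EXPECTED)
    print(order, expected_cost_single_machine(order, WEIGHTS, EXPECTED))
    print([1, 2, 3], expected_cost_single_machine([1, 2, 3], WEIGHTS, EXPECTED))
```

Only the expected processing times enter the list, which matches the limited stochastic information assumed as input.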

These are the first non-trivial approximation algorithms with constant perfor- 
mance guarantees for stochastic scheduling problems. It is an open problem how 
to derive such algorithms for stochastic problems with precedence constraints. 
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Abstract. Deterministic models for project scheduling and control snf- 
fer from the fact that they assnme complete information and neglect 
random influences that occur during project execntion. A typical con- 
sequence is the underestimation of the expected project dnration and 
cost frequently observed in practice. To cope with these phenomena, we 
consider scheduling models in which processing times are random but 
precedence and resource constraints are fixed. Scheduling is done by poli- 
cies which consist of an an online process of decisions that are based on 
the observed past and the a priori knowledge of the distribntion of pro- 
cessing times. We give an informal survey on different classes of policies 
and show that suitable combinatorial properties of such policies give in- 
sights into optimality, computational methods, and their approximation 
behavior. In particular, we present recent constant-factor approximation 
algorithms for simple policies in machine scheduling that are based on a 
snitable polyhedral relaxation of the performance space of policies. 



1 Uncertainty in Scheduling 

In real-life projects, it usually does not suffice to find good schedules for fixed 
deterministic processing times, since these times mostly are only rough estimates 
and subject to unpredictable changes due to unforeseen events such as weather 
conditions, obstruction of resource usage, delay of jobs and others. 

In order to model such influences, the processing time of a job j G V is 
assumed to be a random variable Pj. Then p = (pi,p2, . . . ,p„) denotes the 
(random) vector of processing times, which is distributed according to a joint 
probability distribution Q. This distribution Q is assumed to be known and may 
also contain stochastic dependencies. Furthermore, like in deterministic models, 
we have precedence constraints given by a directed acyclic graph G = (V, E) 
and resource constraints. In the classification scheme of [1], these problems are 
denoted by PS \ prec, pj = sto \ k, where k, is the objective (e.g. the project 
makespan Cmax or the sum of weighted completion times ^ WjCj). 

* Supported by Deutsche Forschungsgemeinschaft under grant Mo 346/3-3 and by 
German Israeli Fonndation under grant 1-564-246.06/97. 
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The necessity to deal with uncertainty in project planning becomes obvi- 
ous if one compares the “deterministic makespan” Cmax(^^(Pi), ■ ■ • ,E{pn)) ob- 
tained from the expected processing times E(j3j) with the expected makespan 
-E(C'max(p))- Even in the absence of resource constraints, there is a systematic 
underestimation C'max(E(pi), . . . , E(p„)) < E(C'max(Pi, • ■ • ,Pn)) which maybe- 
come arbitrarily large with increasing number of jobs or increasing variances of 
the processing times [7]. Equality holds if and only if there is one path that 
is the longest with probability 1. This systematic underestimation of the ex- 
pected makespan has already been observed by Fulkerson [2] . The error becomes 
even worse if one compares the deterministic value C'niax(E(pi), . . . , E(p„)) with 
quantiles tq such that -Pro6{Cmax(p) < tq} > q for large values of q (say q = 0.9 
or 0.95). 

A simple example is given in Figure 1 for a project with n parallel jobs 
that are independent and uniformly distributed on [0,2]. Then the deterministic 
makespan C'max(E(pi), . . . , E(p„)) = 1, while Prob{Cmax < 1) — > 0 for n ^ oo. 
Similarly, all quantiles ^ 2 for n ^ oo (and g > 0). 

This is the reason why good practical planning tools should incorporate 
stochastic methods. 




Fig. 1. Distribution function of the makespan for n = 1, 2, 4, 8 parallel jobs that 
are independent and uniformly distributed on [0,2]. 



2 Planning with Policies 

If the problem involves only precedence constraints, every job can be scheduled 
at its earliest start, i.e., when its last predecessor completes. This is no longer 
possible when resource constraints are present. Planning is then done by policies 
or strategies that dynamically make scheduling decisions based on the observed 
past and the a priori knowledge about the processing time distributions. This can 
be seen as a special stochastic dynamic optimization problem or as an online 
algorithm against a “randomizing” adversary who draws job processing times 
according to a known distribution. 

This model is somewhat related to certain online scenarios, which recently 
have received quite some attention. These scenarios are also based on the as- 
sumption that the scheduler does not have access to the whole instance at once. 
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but rather learns the input piece by piece over time and has to make decisions 
based on partial knowledge only. When carried to an extreme, there is both a 
lack of knowledge on jobs arriving in the future and the running time of every 
job is unknown until it completes. In these models, online algorithms are usually 
analyzed with respect to optimum off-line solutions, whereas here we compare 
ourselves with the best possible policy which is subject to the same uncertainty. 
Note that our model is also more moderate than online scheduling in the sense 
that the number of jobs to be scheduled as well as their joint processing time 
distribution are known in advance. We refer to [16] for an overview on the other 
online scheduling models. 

Our model with random processing times has been studied in machine schedul- 
ing, but much less in project scheduling. The survey [9] has stayed representative 
for most of the work until the mid 90ties. 

A policy n takes actions at decision points, which are t = 0 (project start), 
job completions, and tentative decision times where information becomes avail- 
able. An action at time t consists in choosing a, feasible set of jobs to be started at 
t, where feasible means that precedence and resource constraints are respected, 
and in choosing the next tentative decision time ^pianned^ actual next decision 
time is the minimum of and the first job completion after t. 

The decision which action to take may of course only exploit information 
of the past up to time t and the given distribution Q {non-anticipative char- 
acter of policies). After every job has been scheduled, we have a realization 
p = (pi, . . . ,pn) of processing times and II has constructed a schedule II[p] = 
(S\, 82, - , Sn) of starting times Sj for the jobs j. If K^{p) denotes the “cost” of 

that schedule, and E{k^ ( p)) the expected cost under policy 77, the aim then is 
to find a policy that minimizes the expected cost (e.g., the expected makespan). 






min£(Cn„„j) 







= (I + e,4,4,8,4) with probability - 
y =(1, 4, 4, 4, 8) with probability i 




Observe job 1 
at tentative 
decision time 

^planned _ 



3 



5 



y 






2 



4 



3 



5 



Fig. 2. Optimal policies may involve tentative decision times. 



As an example, consider the problem in Figure 2. The precedence constraints 
are given by the digraph in the upper left corner. The two encircled jobs 2 and 3 
compete for the same scarce resource and may not be scheduled simultaneously. 
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There are two possible realizations x‘^ and y that occur with probability ^ each. 
The aim is to minimize the expected makespan. If one starts jobs 1 and 2, say, 
at time 0, one achieves only an expected makespan of 14. Observing job 1 at 
the tentative decision time = 1 yields 13 instead. The example shows in 

particular that jobs may start at times where no other job ends. 

In general, there need not exist an optimal policy. This can only be guar- 
anteed under assumptions on the cost function k (e.g. continuity) or the dis- 
tribution Q (e.g. finite discrete or with a Lebesgue density), see [9] for more 
details. 

Stability issues constitute an important reason for considering only restricted 
classes of policies. Data deficiencies and the use of approximate methods (e.g. 
simulation) require that the optimum expected cost OPT(k, Q) for an “approx- 
imate” cost function R and distribution Q is “close” to the optimum expected 
cost OPT{k, Q) when R and Q are “close” to k and Q, respectively. (This can be 
made precise by considering uniform convergence R k of cost functions and 
weak convergence Q Q of probability distributions.) 

Unfortunately, the class of all policies is unstable. The above example
illustrates why. Consider $Q^\varepsilon$ as an approximation to
$Q = \lim_{\varepsilon \to 0} Q^\varepsilon$. For $Q$, one is no longer able to
obtain information by observing job 1, and thus only achieves an average makespan
of 14. Figure 3 illustrates this.



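[Figure 3. $\varepsilon \to 0 \Rightarrow Q^\varepsilon \to Q$ with $Q$: $x = (1, 4, 4, 8, 4)$ with probability $\frac{1}{2}$ and $y = (1, 4, 4, 4, 8)$ with probability $\frac{1}{2}$. Annotation: no info when job 1 completes, so start job 2 at $t = 0$.]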




Fig. 3. The class of all policies is unstable. 



The main reason for this instability is the fact that policies may use small, 
almost “not observable” pieces of information for their decision. This can be 
overcome by restricting to robust policies. These are policies that start jobs only 
at completion of other jobs (no tentative decision times) and use only “robust” 
information from the past, viz. only the fact whether a job is completed, busy, 
or not yet started. 

3 Robust Classes of Policies

Policies may be classified by their way of resolving the “essential” resource
conflicts. These conflicts can be modeled by looking at the set $\mathcal{F}$ of
forbidden sets of jobs. Every proper subset $F' \subset F$ of such a set
$F \in \mathcal{F}$ can in principle be scheduled simultaneously, but the set $F$
itself cannot because of the limited resources. In the above example,
$\mathcal{F} = \{\{2, 3\}\}$. For a scheduling problem with $m$ identical
machines, $\mathcal{F}$ consists of all $(m + 1)$-element independent sets of the
digraph $G$ of precedence constraints.
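
For the machine case, the forbidden sets can be enumerated directly. The sketch below reads “independent” as “pairwise unrelated in the transitive closure of $G$”, which is an interpretation rather than a definition taken from the text. The seven-job arc list `ARCS` is hypothetical, and the function names are illustrative.

```python
from itertools import combinations

def transitive_closure(n, arcs):
    """Return succ[i] = set of all jobs reachable from job i (jobs are 1..n)."""
    succ = {i: set() for i in range(1, n + 1)}
    for i, j in arcs:
        succ[i].add(j)
    changed = True
    while changed:
        changed = False
        for i in succ:
            reachable = set()
            for j in succ[i]:
                reachable |= succ[j]
            if not reachable <= succ[i]:
                succ[i] |= reachable
                changed = True
    return succ

def forbidden_sets(n, arcs, m):
    """All (m+1)-element sets of pairwise unrelated jobs (brute force)."""
    succ = transitive_closure(n, arcs)

    def unrelated(i, j):
        return j not in succ[i] and i not in succ[j]

    return [set(c) for c in combinations(range(1, n + 1), m + 1)
            if all(unrelated(i, j) for i, j in combinations(c, 2))]

# Hypothetical precedence arcs on jobs 1..7 (an assumption, not from the text).
ARCS = [(1, 4), (1, 5), (2, 3), (3, 4), (3, 7), (4, 6), (5, 6)]
print(forbidden_sets(7, ARCS, m=2))
```

The enumeration is exponential in $m$ and is meant only for small instances.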

3.1 Priority Policies 

A well-known class of robust policies is the class of priority policies. They settle
the resource conflicts by a priority list $L$, i.e., at every decision time, they
start as many jobs as possible in the order of $L$. Though simple and easy to
implement, they exhibit a rather unsatisfactory stability behavior (Graham
anomalies). Let us view a policy $\Pi$ as a function
$\Pi : \mathbb{R}^n \to \mathbb{R}^n$ that maps every vector
$p = (p_1, \dots, p_n)$ of processing times to a schedule
$\Pi[p] = (S_1, S_2, \dots, S_n)$ of starting times $S_j$ for the jobs $j$. In this
interpretation as a function, priority policies are in general neither continuous
nor monotone. This is illustrated in Figure 4 on one of Graham’s examples [5].
When $p$ changes continuously and monotonically from $y$ into $x$,
$\Pi[p]_7 = S_7$ jumps discontinuously and $\Pi[p]_5 = S_5$ decreases while $p$
grows.
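
The anomaly can be reproduced with a small simulation of a priority policy. The precedence arcs in `ARCS` are an assumption: they are not recoverable from the figure and were chosen only so that the behavior matches the description above (two machines, list $1 < 2 < \dots < 7$, $y = x - 1$). Function names are illustrative.

```python
def priority_policy(p, arcs, m, order):
    """Simulate a priority (list scheduling) policy on m identical machines.

    p: processing times of jobs 1..n, arcs: precedence pairs (i, j),
    order: priority list. Returns a dict job -> start time.
    """
    n = len(p)
    proc = {j: p[j - 1] for j in range(1, n + 1)}
    preds = {j: set() for j in proc}
    for i, j in arcs:
        preds[j].add(i)
    start, finish = {}, {}
    t = 0
    while len(start) < n:
        done = {j for j in finish if finish[j] <= t}
        busy = len(start) - len(done)
        for j in order:  # scan the list in priority order
            if busy == m:
                break
            if j not in start and preds[j] <= done:
                start[j], finish[j] = t, t + proc[j]
                busy += 1
        # advance to the next completion time
        t = min(f for f in finish.values() if f > t)
    return start

def makespan(p, start):
    return max(start[j] + p[j - 1] for j in start)

# Assumed precedence arcs (hypothetical reconstruction, not from the source).
ARCS = [(1, 4), (1, 5), (2, 3), (3, 4), (3, 7), (4, 6), (5, 6)]
ORDER = [1, 2, 3, 4, 5, 6, 7]
y = (3, 1, 1, 4, 4, 9, 9)
x = tuple(v + 1 for v in y)

Sx = priority_policy(x, ARCS, 2, ORDER)
Sy = priority_policy(y, ARCS, 2, ORDER)
print("makespan for x:", makespan(x, Sx), " makespan for y:", makespan(y, Sy))
print("S_7:", Sy[7], "->", Sx[7], "  S_5:", Sy[5], "->", Sx[5])
```

Under these assumed arcs, shortening every job by one unit makes the schedule longer, and the start time of job 5 moves in the opposite direction to the processing times.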



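[Figure 4. Setting: 2 identical machines, objective $\min C_{\max}$, priority list $L = 1 < 2 < \dots < 7$, and $y = x - 1 = (3, 1, 1, 4, 4, 9, 9)$. The Gantt charts compare the schedules obtained for $x$ and for $y$.]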

Fig. 4. Priority policies are neither continuous nor monotone. 





3.2 Preselective Policies 

Policies with a much better stability behavior are the preselective policies
introduced in [8]. They solve every resource conflict given by a forbidden set
$F \in \mathcal{F}$ by choosing a priori a waiting job $j_F \in F$ that can only
start after at least one job $j \in F \setminus \{j_F\}$ has completed. This
defines a disjunctive waiting condition $(F \setminus \{j_F\}, j_F)$ for every
forbidden set $F \in \mathcal{F}$. A preselective policy then does early start
scheduling w.r.t. the precedence constraints and the system $W$ of waiting
conditions obtained from choosing waiting jobs for the forbidden sets. The same
idea is known in deterministic scheduling as delaying alternatives, see e.g. [1,
Section 3.1].

A very useful combinatorial model for preselective policies has been introduced
in [13]. Waiting conditions and ordinary precedence constraints are modeled by an
AND/OR graph that contains AND-nodes for the ordinary precedence constraints and
OR-nodes for the disjunctive precedence constraints. Figure 5 shows how.




[Figure 5. Left panel: select job 4 for both forbidden sets. Right panel: AND/OR network representing the preselective policy.]

Fig. 5. AND/OR graph induced by a preselective policy.



Since preselective policies do early start scheduling, it follows that the start
time of a job $j$ is the minimum length of a longest path to node $j$ in the
AND/OR graph, where the minimum is taken over the different alternatives in the
OR-nodes. As a consequence, preselective policies are continuous and monotone (in
the function interpretation) and thus avoid the Graham anomalies (see Figure 6).
Surprisingly, the converse also holds: every continuous robust policy is
preselective, and every monotone policy $\Pi$ is dominated by a preselective
policy, i.e., there is a preselective policy $\Pi'$ with $\Pi' \le \Pi$ [15].
This implies in particular that Graham anomalies come in pairs (discontinuous and
not monotone).

There are several interesting and natural questions related to AND/OR graphs
(or preselective policies). One is feasibility, since an AND/OR graph may contain
cycles. Here feasibility means that all jobs $j \in V$ can be arranged in a linear
list $L$ such that all waiting conditions given by the AND/OR graph are satisfied
(all AND predecessors of a job $j$ occur before $j$ in $L$, and at least one OR
predecessor occurs before $j$ in $L$). Another question is transitivity, i.e., is
a new waiting condition “$j$ waits for at least one job from $V' \subset V$”
implied by the given ones? Or transitive reduction, i.e., is there a unique
“minimal” AND/OR graph that is equivalent to a given one (in the sense that they
admit the same linear lists)? All these questions have been addressed in [13].
Feasibility can be detected in linear time. A variant of feasibility checking
computes transitively forced waiting conditions, and there is a unique minimal
representation that can be constructed in polynomial time.

[Figure 6. Setting: 2 identical machines, so $F = \{4, 5, 7\}$ is the only forbidden set; job 7 is the waiting job; $y = x - 1 = (3, 1, 1, 4, 4, 9, 9)$.]

Fig. 6. A preselective policy for Graham’s example.
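
The feasibility notion can be illustrated by a greedy construction of the linear list. This is a simple quadratic-time sketch, not the linear-time algorithm of [13]. The representation (a dict of AND predecessors and a dict of OR conditions per job) and all names are assumptions made for illustration.

```python
def feasible_order(jobs, and_preds, or_conds):
    """Try to build a linear list satisfying all waiting conditions.

    and_preds: dict job -> set of jobs that must all come earlier.
    or_conds:  dict job -> list of sets; at least one job of each set
               must come earlier.
    Returns a list of all jobs, or None if no such list exists.
    """
    placed, order = set(), []
    progress = True
    while progress:
        progress = False
        for j in jobs:
            if j in placed:
                continue
            ok_and = and_preds.get(j, set()) <= placed
            ok_or = all(w & placed for w in or_conds.get(j, []))
            if ok_and and ok_or:
                placed.add(j)
                order.append(j)
                progress = True
    return order if len(order) == len(jobs) else None

# Hypothetical instance: job 7 waits for job 4 or job 5.
and_preds = {3: {2}, 4: {1, 3}, 5: {1}, 6: {4, 5}, 7: {3}}
print(feasible_order(range(1, 8), and_preds, {7: [{4, 5}]}))

# Two jobs that wait for each other cannot be arranged in a list.
print(feasible_order([1, 2], {}, {1: [{2}], 2: [{1}]}))
```

Placing a job never invalidates a condition that is already satisfied, which is why the greedy pass either places every job or stops at a set of jobs that block one another.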

The AND/OR graph is also useful for several computational tasks related to
preselective policies.

First, it provides the right structure to compute the start times $\Pi[p]$ of a
policy $\Pi$ for a given realization $p$ of processing times. This can be done by
a Dijkstra-like algorithm since all processing times $p_j$ are positive, see [12].
For general arc weights, the complexity status of computing earliest start times
in AND/OR graphs is open. The corresponding decision problem “Is the earliest
start of node $v$ at most $t$?” is in NP $\cap$ coNP, and only pseudopolynomial
algorithms are known to compute the earliest start times. This problem is closely
related to mean-payoff games considered e.g. in [19]. This and other relationships
as well as more applications of AND/OR graphs (such as disassembly in scheduling
[4]) are discussed in [12], which also derives a polynomial algorithm to compute
the earliest start times when all arc weights are non-negative. This is already a
non-trivial task that requires checking for certain 2-connected subgraphs with arc
weights 0, which can be done by a variant of the feasibility checking algorithm.
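
A minimal sketch of such a Dijkstra-like computation for positive processing times follows. Jobs are settled in order of their completion times, and an OR condition is satisfied by the first of its jobs to complete. The data layout, the function name, and the AND predecessors in the example are assumptions; the waiting condition “job 7 waits for job 4 or job 5” follows the setting of Figure 6.

```python
import heapq

def earliest_starts(p, and_preds, or_conds):
    """Earliest start times in an AND/OR graph with positive processing times.

    p: dict job -> processing time (> 0).
    and_preds: dict job -> set of jobs that must all complete first.
    or_conds: dict job -> list of sets; one job of each set must complete first.
    Returns dict job -> start time (jobs blocked by a cycle are omitted).
    """
    open_and = {j: len(and_preds.get(j, ())) for j in p}
    open_or = {j: [set(w) for w in or_conds.get(j, [])] for j in p}
    start, heap = {}, []

    def release(j, t):
        if j not in start and open_and[j] == 0 and not open_or[j]:
            start[j] = t
            heapq.heappush(heap, (t + p[j], j))

    for j in p:
        release(j, 0)
    while heap:
        t, i = heapq.heappop(heap)  # job i completes at time t
        for j in p:
            if j in start:
                continue
            if i in and_preds.get(j, ()):
                open_and[j] -= 1
            # the first completed job of an OR set satisfies that condition
            open_or[j] = [w for w in open_or[j] if i not in w]
            release(j, t)
    return start

# Assumed AND predecessors (hypothetical) and the waiting condition for job 7.
AND = {3: {2}, 4: {1, 3}, 5: {1}, 6: {4, 5}, 7: {3}}
OR = {7: [{4, 5}]}
y = dict(zip(range(1, 8), (3, 1, 1, 4, 4, 9, 9)))
x = {j: v + 1 for j, v in y.items()}
Sx, Sy = earliest_starts(x, AND, OR), earliest_starts(y, AND, OR)
print(Sx)
print(Sy)
```

On this instance every start time under $x$ is at least the corresponding start time under $y$, in line with the monotonicity of preselective policies.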

Second, it can be used to detect implied waiting conditions, which is useful 
if “good” or optimal preselective policies are constructed in a branch and bound 
approach. There, branching is done on the possible choices of waiting jobs for 
a forbidden set F as demonstrated in Figure 7. The algorithm for detecting 
implied waiting conditions can then be used to check if the forbidden set F of 
the current tree node N has already an implicit waiting job that is implied by 
the earlier choices of waiting jobs in ancestors of N . 

This also provides a criterion for dominance shown in [14]: A preselective 
policy is dominated iff no forbidden set F has a transitively implied waiting job 
that is different from the chosen waiting job. 

Fig. 7. Computing preselective policies by branch and bound.



3.3 Special Preselective Policies 

There are several interesting subclasses of preselective policies. 

For instance, one may in addition to the waiting job $j_F$ from a forbidden
set $F$ also choose the job $i_F \in F$ for which $j_F$ must wait. That means that
the conflict given by $F$ is solved by introducing the additional precedence
constraint $i_F < j_F$. Then the AND/OR graph degenerates into an AND graph, i.e.,
consists like $G$ of ordinary precedence constraints. Feasibility of AND/OR graphs
then reduces to being acyclic and early start scheduling can be done by longest
path calculations. This class of policies is known as earliest start policies [8].
They are convex functions and every convex policy is already an earliest start
policy [15].

Another class, the class of linear preselective policies, has been introduced in
[14]. It is motivated by the precedence tree concept used for
$PS \mid prec \mid C_{\max}$ (see e.g. [1, Section 3.1]), and combines the
advantages of preselective policies and priority rules. Such a policy $\Pi$ uses a
priority list $L$ (that is a topological sort of the graph $G$ of precedence
constraints) and chooses the waiting job $j_F \in F$ as the last job from $F$ in
$L$. It follows that a preselective policy is linear iff the corresponding AND/OR
graph is acyclic. This class of policies possesses many favorable properties
regarding domination, stability, computational effectiveness, and solution
quality. Computational evidence is given in [17]. An example is presented in
Figure 8. Here “Nodes” refers to the number of nodes considered in the branch and
bound tree.
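
The selection rule of a linear preselective policy is short enough to state as code. The sketch below assumes that the forbidden sets are given explicitly; the function name and the example lists are illustrative only.

```python
def waiting_jobs(priority_list, forbidden_sets):
    """For each forbidden set F, pick the job of F that comes last in the list."""
    pos = {j: k for k, j in enumerate(priority_list)}
    return {frozenset(F): max(F, key=pos.__getitem__) for F in forbidden_sets}

# With the list 1 < 2 < ... < 7, job 7 waits in the forbidden set {4, 5, 7}.
print(waiting_jobs([1, 2, 3, 4, 5, 6, 7], [{4, 5, 7}]))
# A different (hypothetical) list makes job 5 the waiting job instead.
print(waiting_jobs([1, 2, 3, 7, 4, 5, 6], [{4, 5, 7}]))
```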

Linear preselective policies are related to job-based priority rules known from
deterministic scheduling. These behave like priority policies, but obey the
additional constraint that the start times $S_j$ preserve the order given by the
priority list $L$, i.e. $S_i \le S_j$ if $i$ precedes $j$ in $L$. Every job-based
priority rule is linear preselective [14] but may be dominated by a better linear
preselective policy. The advantage of job-based priority policies lies in the fact
that they do not explicitly need to know the (possibly large) system $\mathcal{F}$
of forbidden sets. They settle the conflicts online by the condition on the start
times.



Truncated Erlang distribution on [0.2*mean; 2.6*mean]

Optimum deterministic makespan     203               CPU: 0.17 sec
Optimum expected makespan          243.2
Optimal preselective policy        Nodes: 115007     CPU: 3772.01 sec
Opt. linear preselective policy    Nodes: 4209       CPU: 49.85 sec



Fig. 8. Computational results for linear preselective policies. 



3.4 General Robust Policies 

The class of all robust policies has been studied in [10] under the name set 
policies (as the decision at a decision time t is only based on the knowledge of 
the set of completed jobs and the set of busy jobs). 

These policies behave locally like earliest start policies, i.e., for every robust
policy $\Pi$, there is a partition of $\mathbb{R}^n$ into finitely many polyhedral
cones such that, locally on each cone, $\Pi$ is an earliest start policy and thus
convex, continuous and monotone. This shows that Graham anomalies can only occur
at the boundaries of these cones.

It turns out that for problems with independent exponential processing time 
distributions and “additive” cost functions, there is an optimal policy that is 
robust. 

Here additive means that there is a set function $g : 2^V \to \mathbb{R}$ (the
cost rate) such that $\kappa(C_1, \dots, C_n) = \int g(U(t))\,dt$, where $U(t)$
denotes the set of jobs that are still uncompleted at time $t$. Special cases are
$\kappa = C_{\max}$, where $g(\emptyset) := 0$ and $g(U) = 1$ otherwise, and
$\kappa = \sum w_j C_j$, where $g(U) = \sum_{j \in U} w_j$.

In more special cases (no precedence constraints, $m$ identical parallel
machines) there may even be optimal policies that are priority policies (again
for independent exponential processing time distributions). If
$\kappa = C_{\max}$, then LEPT (longest expected processing time first) is known
to be optimal, while for $\kappa = \sum C_j$, SEPT (shortest expected processing
time first) is optimal [18]. It is an open problem if there is also an optimal
priority policy for $\kappa = \sum w_j C_j$.



4 How Good Are Simple Policies? 

Compared to its deterministic counterpart, only little is known about the
approximation behavior of (simple) policies for arbitrary processing time
distributions. A first step in this direction is taken in [11] for the problem of
minimizing the average weighted completion time on identical parallel machines,
i.e., $P \mid r_j, p_j = sto \mid \sum w_j C_j$.

This approach is based on a suitable polyhedral relaxation of the “performance
space” $\mathcal{E} = \{(E(C_1^\Pi), \dots, E(C_n^\Pi)) \mid \Pi \text{ policy}\}$
of all vectors of expected completion times achieved by any policy. The optimal
solution of an LP over this polyhedral relaxation is then used to construct
priority and linear preselective policies, and these are shown to have constant
factor performance guarantees, even in the presence of release dates. This
generalizes several previous results from deterministic scheduling and also yields
a worst case performance guarantee for the well known WSEPT heuristic.

We will illustrate this in the simplest case
$P \mid p_j = sto \mid \sum w_j C_j$ and refer to [11] for more information and
also related work on the optimal control of stochastic systems [3].

A policy is called an $\alpha$-approximation if its expected cost is always within
a factor of $\alpha$ of the optimum value, and if it can be determined and executed
in polynomial time with respect to the input size of the problem. To cope with the
input size of a stochastic scheduling problem, which includes non-discrete data in
general, we assume that the input is specified by the number of jobs, the number
of machines, and the encoding lengths of weights $w_j$, release dates $r_j$,
expected processing times $E[p_j]$, and, as the sole stochastic information, an
upper bound $\Delta$ on the coefficients of variation of all processing time
distributions $p_j$, $j = 1, \dots, n$. The coefficient of variation of a given
random variable $X$ is the ratio $\sqrt{\mathrm{Var}[X]} / E[X]$. Thus, it is in
particular sufficient if all second moments $E[p_j^2]$ are given. This notion of
input size is motivated by the fact that from a practitioner’s point of view the
expected processing times of jobs together with the assumption of some typical
distribution “around them” is realistic and usually suffices to describe a
stochastic scheduling problem. Note, however, that the performance guarantees
obtained actually hold with respect to optimal policies that make use of the
complete knowledge of the distributions of processing times.

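The polyhedral relaxation $\mathcal{P}$ is derived by a pointwise argument from
known valid inequalities in completion time variables for the deterministic case
[6]. Besides $E(C_j) \ge E(p_j)$ the crucial valid inequalities are

$$\sum_{j \in A} E(p_j) E(C_j) \ge \frac{1}{2m} \Bigl( \sum_{j \in A} E(p_j) \Bigr)^2 + \frac{1}{2} \sum_{j \in A} E(p_j)^2 - \frac{m - 1}{2m} \sum_{j \in A} \mathrm{Var}(p_j) \quad \text{for all } A \subseteq V.$$

They differ from the deterministic counterpart in the term involving the
variances of the processing times. With the upper bound $\Delta$ on the
coefficients of variation, i.e. $\mathrm{Var}(p_j) \le \Delta^2 E(p_j)^2$, they
may be rewritten as

$$\sum_{j \in A} E(p_j) E(C_j) \ge \frac{1}{2m} \Bigl( \sum_{j \in A} E(p_j) \Bigr)^2 + \frac{1}{2} \sum_{j \in A} E(p_j)^2 - \frac{(m - 1) \Delta^2}{2m} \sum_{j \in A} E(p_j)^2 \quad \text{for all } A \subseteq V.$$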
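
The pointwise argument can be sketched as follows. The sketch assumes independent processing times and a non-anticipative policy, so that the start time $S_j$ is independent of $p_j$. It also assumes the deterministic inequality in the normalization shown in the first line. These assumptions are stated here for illustration and are not taken from the text above.

```latex
% Deterministic inequality, assumed valid for every realization p on m machines:
\sum_{j \in A} p_j C_j \;\ge\; \frac{1}{2m} \Bigl( \sum_{j \in A} p_j \Bigr)^2
                               + \frac{1}{2} \sum_{j \in A} p_j^2 .

% Left-hand side in expectation, using C_j = S_j + p_j and S_j independent of p_j:
E(p_j C_j) = E(p_j) E(S_j) + E(p_j^2) = E(p_j) E(C_j) + \mathrm{Var}(p_j).

% Right-hand side in expectation, using independence of the p_j:
E\Bigl( \bigl( \textstyle\sum_{j \in A} p_j \bigr)^2 \Bigr)
   = \Bigl( \sum_{j \in A} E(p_j) \Bigr)^2 + \sum_{j \in A} \mathrm{Var}(p_j),
\qquad
E(p_j^2) = E(p_j)^2 + \mathrm{Var}(p_j).

% Collecting the variance terms gives the coefficient
\frac{1}{2m} + \frac{1}{2} - 1 = -\frac{m - 1}{2m}.
```

For $m = 1$ the variance term vanishes, so on a single machine the inequality has the same form as in the deterministic case with expected values in place of processing times.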



The LP relaxation $\min \{\sum_{j \in V} w_j C_j \mid C \in \mathcal{P}\}$ can be
solved in polynomial time, namely in $O(n^2)$ time by purely combinatorial
methods [11]. An optimal solution $C^{LP} = (C_1^{LP}, \dots, C_n^{LP})$ to this LP
defines an ordering $L$ of jobs according to nondecreasing values of $C_j^{LP}$.
This list $L$ is then used to define a priority policy or linear preselective
policy for the original problem.

If $\Pi$ denotes such a policy, clearly
$\sum_{j \in V} w_j C_j^{LP} \le OPT \le \sum_{j \in V} w_j E(C_j^\Pi)$, and the
goal is to prove
$\sum_{j \in V} w_j E(C_j^\Pi) \le \alpha \sum_{j \in V} w_j C_j^{LP}$ for some
$\alpha \ge 1$. This leads to a performance guarantee of $\alpha$ for the policy
$\Pi$ and also to a (dual) guarantee for the quality of the LP lower bound:
$\sum_{j \in V} w_j E(C_j^\Pi) \le \alpha \cdot OPT$ and
$\sum_{j \in V} w_j C_j^{LP} \ge \frac{1}{\alpha} OPT$.

The performance guarantee thus obtained is
$\alpha = 2 - \frac{1}{m} + \max\{1, \dots\}$, which may be improved to
$\alpha = 1 + \dots$ by the use of a specific priority policy (weighted expected
processing time first). For problems with release dates, this priority policy can
be arbitrarily bad, and the best guarantee is given by a job-based priority policy
defined via the LP. The guarantees become stronger if $\Delta \le 1$, which is the
case for distributions that are NBUE (new better than used in expectation), a
class that seems reasonable for applications.
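
The priority list of the WSEPT heuristic mentioned above can be computed by a single sort. The sketch assumes the usual reading of WSEPT, namely ordering jobs by nonincreasing ratio $w_j / E[p_j]$; the function name and the sample data are illustrative.

```python
def wsept_order(weights, expected_times):
    """Return job indices sorted by nonincreasing weight / expected processing time."""
    jobs = range(len(weights))
    return sorted(jobs, key=lambda j: -weights[j] / expected_times[j])

# Ratios are 0.5, 1.0 and 2.0, so the job with index 2 gets the highest priority.
print(wsept_order([1, 3, 2], [2.0, 3.0, 1.0]))
```

Only the expected processing times enter the list, which matches the input model described earlier, where the distributions are known only through their first moments and a bound on the coefficient of variation.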

These are the first non-trivial approximation algorithms with constant perfor- 
mance guarantees for stochastic scheduling problems. It is an open problem how 
to derive such algorithms for stochastic problems with precedence constraints. 

Abstract. One of the most flourishing areas of research in the design
and analysis of approximation algorithms has been for facility location
problems. In particular, for the metric case of two simple models, the
uncapacitated facility location and the $k$-median problems, there are
now a variety of techniques that yield constant performance guarantees.
These methods include LP rounding, primal-dual algorithms, and local
search techniques. Furthermore, the salient ideas in these algorithms and
their analyses are simple to explain and reflect a surprising degree of
commonality. This note is intended as a companion to our lecture at APPROX
2000, mainly to give pointers to the appropriate references.



1 A Tale of Two Problems 

In the past several years, there has been a steady series of developments in the 
design and analysis of approximation algorithms for two facility location
problems: the uncapacitated facility location problem, and the $k$-median problem.
Furthermore, although these two problems were always viewed as closely re- 
lated, some of this recent work has not only relied on their interrelationship, but 
also given new insights into the ways in which algorithms for the former problem 
yield algorithms, and performance guarantees, for the latter. 

In the $k$-median problem, the input consists of a parameter $k$, and $n$ points in
a metric space; that is, there is a set $N$ and for each pair of points
$i, j \in N$, there is a given distance $d(i, j)$ between them that is symmetric
(i.e., $d(i, j) = d(j, i)$ for each $i, j \in N$), satisfies the triangle
inequality (i.e., $d(i, j) + d(j, k) \ge d(i, k)$ for each $i, j, k \in N$), and
also has the property that $d(i, i) = 0$ for each $i \in N$. The aim is to select
$k$ of the $n$ points to be medians, and then assign each of the $n$ input points
to its closest median so as to minimize the average distance that an input point
is from its assigned median. Early work on the $k$-median problem was motivated by
applications in facility location: each median corresponds to a facility to be
built, and the input set of points corresponds to the set of clients that need to
be serviced by these facilities; there are resources sufficient to build only $k$
facilities, and one wishes to minimize the total cost of servicing the clients.

In the uncapacitated facility location problem, which is also referred to as the
simple plant location problem, the strict requirement that there be $k$ facilities
is relaxed, by introducing a cost associated with building a facility; these costs
are then incorporated into the overall objective function. More precisely, the
input consists of two sets of points (which need not be disjoint), the potential
facility location points $F$, and the set of clients $C$; in this case, we shall
let $n$ denote $|F \cup C|$. For each point $i \in F$, there is a given cost $f_i$
that reflects the cost incurred in opening a facility at this location. We wish to
decide which facilities to open so as to minimize the total cost for opening them,
plus the total cost of servicing each of the clients from its closest open
facility. The uncapacitated facility location problem is one of the most
well-studied problems in the Operations Research literature, dating back to the
work of Balinski [2], Kuehn and Hamburger [11], Manne [14], and Stollsteimer
[18,19] in the early 60’s.
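
The two objective functions can be written down directly. The sketch below uses points on a line with $d(i, j) = |i - j|$ and made-up opening costs purely as sample data; the function names are illustrative.

```python
def kmedian_cost(points, medians, d):
    """Average distance from each input point to its closest median."""
    return sum(min(d(i, c) for c in medians) for i in points) / len(points)

def ufl_cost(clients, open_facilities, f, d):
    """Opening costs plus the cost of serving each client from its closest open facility."""
    opening = sum(f[i] for i in open_facilities)
    service = sum(min(d(i, j) for i in open_facilities) for j in clients)
    return opening + service

d = lambda a, b: abs(a - b)          # a metric on the real line
points = [0, 1, 5, 6]
f = {0: 3, 1: 1, 5: 1, 6: 3}         # hypothetical opening costs

print(kmedian_cost(points, [1, 5], d))   # k = 2 medians
print(ufl_cost(points, [1, 5], f, d))    # the same two points opened as facilities
```

The only difference between the two models in this sketch is that the cardinality bound $k$ is replaced by the opening costs $f_i$ in the objective.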

Throughout this paper, a $\rho$-approximation algorithm is a polynomial-time
algorithm that always finds a feasible solution with objective function value
within a factor of $\rho$ of optimal. Hochbaum [8] showed that the greedy
algorithm is an $O(\log n)$-approximation algorithm for this problem, and provided
instances to verify that this analysis is asymptotically tight. In fact, this
result was shown for the more general setting, in which the input points need not
belong to a metric space.

Lin and Vitter [12] gave an elegant technique, called filtering, for rounding
fractional solutions to linear programming relaxations. As one application of this
technique for designing approximation algorithms, they gave another
$O(\log n)$-approximation algorithm for the uncapacitated facility location
problem. Furthermore, Lin and Vitter gave an algorithm for the $k$-median problem
that finds a solution for which the objective is within a factor of
$1 + \varepsilon$ of the optimum, but is infeasible since it opens
$(1 + 1/\varepsilon)(\ln n + 1)k$ facilities. Both of these results hold for the
general setting; that is, the input points need not lie in a metric space. In a
companion paper, Lin and Vitter [13] focused attention on the metric case, and
showed that for the $k$-median problem, one can find a solution of cost no more
than $2(1 + \varepsilon)$ times the optimum, while using at most
$(1 + 1/\varepsilon)k$ facilities.

The recent spate of results derives algorithms that can be divided, roughly
speaking, into three categories. There are rounding algorithms that rely on linear
programming in the same way as the work of Lin and Vitter, in that they first
solve the linear relaxation of a natural integer programming formulation of the
problem, and then round the optimal LP solution to an integer solution of
objective function value no more than a factor of $\rho$ greater, thereby yielding
a $\rho$-approximation algorithm. The second type of algorithm also relies on the
linear programming relaxation, but only in an implicit way; in a primal-dual
algorithm, the aim is to simultaneously derive a feasible integer solution for the
original problem, as well as a feasible solution to the dual linear program to its
linear relaxation. If one can show that the objective function value of the former
always is within a factor of $\rho$ of the latter, then this also yields a
$\rho$-approximation algorithm. Finally, there are local search algorithms, where
one maintains a feasible solution to the original problem, and then iteratively
attempts to make a minor modification (with respect to a prescribed notion of
“neighboring” solutions) so as to yield a solution of lower cost. Eventually, one
obtains a local optimum; that is, a solution of cost no more than that of each of
its neighboring solutions. In this case, one must also derive the appropriate
structural properties in order to conclude that any locally optimal solution is
within a factor of $\rho$ of the global optimum. These three classes of algorithms
will immediately be blurred; for example, some algorithms will start with an LP
rounding phase, but end with a local search phase.

One other class of algorithmic techniques should also be briefly mentioned.
Arora, Rao, and Raghavan [1] considered these two problems for geometrically
defined metrics. For the 2-dimensional Euclidean case of the $k$-median problem
(either when the medians must be selected from among the input points, or
when they are allowed to be selected arbitrarily from the entire space) and
the uncapacitated facility location problem, they give a randomized polynomial
approximation scheme; that is, they give a randomized
$(1 + \varepsilon)$-approximation algorithm, for any fixed $\varepsilon > 0$. No
such schemes are likely to exist for the general metric case: Guha and Khuller [7]
proved lower bounds, respectively, of 1.463 and $1 + 1/e$ (based on complexity
assumptions, of course) for the uncapacitated facility location and $k$-median
problems.



2 LP Rounding Algorithms 

The first approximation algorithm with a constant performance guarantee for
the (metric) uncapacitated facility location problem was given by Shmoys, Tardos,
and Aardal [17]. Their algorithm is an LP rounding algorithm; they give a natural
extension of the techniques of Lin and Vitter [13] to yield an algorithm with
performance guarantee equal to $3/(1 - e^{-3}) \approx 3.16$.

Guha and Khuller [7] observed that one can strengthen the LP relaxation
by approximately guessing the proportion of the overall cost incurred by the
facilities in the optimal solution, and adding this as a constraint. Since there are
only a polynomial number of reasonably spaced guesses, we can try them all (in
polynomial time). Guha and Khuller showed that by adding a local search phase,
starting with the solution obtained by rounding the optimum to the “right” LP,
one obtains a 2.41-approximation algorithm. For their result, the local search
is extremely simple; one checks only whether some additional facility can be
opened so that the overall cost decreases, and if so, one adds the facility that
most decreases the overall cost.

The LP rounding approach was further strengthened by Chudak and Shmoys
[5,6] who used only the simple LP relaxation (identical to the one used by Lin
and Vitter and Shmoys, Tardos, and Aardal), but relied on stronger information
about the structure of optimal solutions to the linear programming relaxation to
yield a performance guarantee of $1 + 2/e \approx 1.74$. Another crucial
ingredient in their approach is that of randomized rounding, which is a technique
introduced by Raghavan and Thompson [16] in the context of multicommodity flow; in
this approach, the fractional values contained in the optimal LP solution are
treated as probabilities. Chudak and Shmoys showed how to incorporate the
main decomposition results of [17] so as to obtain a variant that might best be
called clustered randomized rounding. Chudak and Shmoys also showed that the
technique of Guha and Khuller could be applied to their algorithm, and by doing
so, one improves the performance guarantee by a microscopic amount.

The first constant performance guarantee for the $k$-median problem also relied
on an LP rounding approach: Charikar, Guha, Tardos, and Shmoys [4] gave
a 20/3-approximation algorithm. A wide class of combinatorially defined linear
programs has the property that there exists an optimal solution with the
property that each variable is equal to 0, 1/2, or 1; as will be discussed by
Hochbaum in another invited talk at this workshop, this property often yields
a 2-approximation algorithm as a natural corollary. Unfortunately, the linear
relaxation used for the $k$-median problem does not have this property. However,
the algorithm of [4] is based on the idea that one can reduce the problem to an
input for which there is such a 1/2-integral solution, while losing only a
constant factor, and then subsequently round the 1/2-integral solution to an
integer one.

3 Primal-Dual Algorithms 

In one of the most exciting developments in this area, Jain and Vazirani [9]
gave an extremely elegant primal-dual algorithm for the uncapacitated facility
location problem; they showed that their approach yields a 3-approximation
algorithm, and it can be implemented to run in $O(n^2 \log n)$ time. In fact, Jain
and Vazirani proved a somewhat stronger performance guarantee; they showed
that if we overcharge each facility used in the resulting solution by a factor of 3,
then the total cost incurred is still within a factor of 3 of the optimum cost for the
unaltered input. In fact, this overcharged cost is also within a factor of 3 of the
value of the LP relaxation with the original cost data. Mettu and Plaxton [15]
give a variant of this algorithm (which is not explicitly a primal-dual algorithm,
but builds substantially on its intuition) that can be implemented to run in
$O(n^2)$ time, which is linear in the size of the input; this variant does not have
the stronger guarantee (but is still a 3-approximation algorithm).

This stronger performance guarantee for the uncapacitated facility location
problem has significant implications for the $k$-median problem. One natural
connection between the two problems works as follows: the facility costs can
be viewed as Lagrangean multipliers that enforce the constraint that exactly
$k$ facilities are used. Suppose that we start with an instance of the $k$-median
problem, and define an input to the uncapacitated facility location problem by
letting $C = F = N$, and setting the cost of each facility to be a common value
$\phi$. If $\phi = 0$, then clearly all facilities will be opened in the optimal
solution, whereas for a sufficiently large value of $\phi$, only one facility will
be opened in the optimal solution. A similar trade-off curve will be generated by
any (reasonable) algorithm for the uncapacitated facility location problem.

Jain and Vazirani observed that if their approximation algorithm is used, and
$\phi$ is set so that the number of facilities opened is exactly $k$, then the
resulting solution for the $k$-median problem has cost within a factor of 3 of
optimal. Unfortunately, the trade-off curve need not be continuous, and hence it is
possible that no such value of $\phi$ exists. However, if this unlucky event
occurs, then one can generate two solutions from essentially equal values of
$\phi$, where one solution has more than $k$ facilities and the other has fewer;
Jain and Vazirani show how to combine these two solutions to obtain a new solution
with $k$ facilities of cost that is within a factor of 6 of optimal. Charikar and
Guha [3] exploited the fact that these two solutions have a great deal of common
structure in order to give a refined analysis; in this way, they derive a
4-approximation algorithm for this problem.
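
The trade-off between the common facility cost $\phi$ and the number of open facilities can be observed on a toy instance. The sketch below substitutes a brute-force optimal solver for the approximation algorithms discussed above, so it illustrates only the Lagrangean connection and not the algorithm of Jain and Vazirani. The point set and the function name are made up.

```python
from itertools import combinations

def best_ufl_solution(points, phi, d):
    """Brute-force optimum of the instance with C = F = N and uniform cost phi.

    Exponential time; intended for tiny inputs only.
    """
    best = None
    for size in range(1, len(points) + 1):
        for S in combinations(points, size):
            cost = phi * size + sum(min(d(i, j) for i in S) for j in points)
            if best is None or cost < best[0]:
                best = (cost, S)
    return best[1]

points = [0, 1, 5, 6, 20]            # hypothetical points on a line
d = lambda a, b: abs(a - b)
for phi in [0, 0.5, 2, 10, 100]:
    S = best_ufl_solution(points, phi, d)
    print(f"phi = {phi:>5}: {len(S)} facilities open, {S}")
```

Sweeping $\phi$ in this way yields solutions for only some values of $k$, which illustrates why two solutions for nearby values of $\phi$ may have to be combined.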

Mettu and Plaxton [15] also show how to extend their approach to the
uncapacitated facility location problem to obtain an $O(n^2)$-time algorithm for
the $k$-median problem; in fact, their algorithm has the miraculous property that
it outputs a single permutation of the input nodes such that, for any $k$, the
first $k$ nodes constitute a feasible solution within a constant factor of the
optimal $k$-node solution. Thorup [20] has also given linear-time constant-factor
approximation algorithms for the $k$-median problem; if the metric is defined with
respect to a graph with $m$ edges, then he gives a $(12 + o(1))$-approximation
algorithm that runs in $O(m)$ time.



4 Local Search Algorithms 

Local search is one of the most successful approaches to computing, in practice, 
good solutions to NP-hard optimization problems. Indeed, many of the most 
visible methods to the general scientific community are variations on this theme; 
simulated annealing, genetic algorithms, even neural networks can all be viewed 
as coming from this family of algorithms. In a local search algorithm, more 
narrowly viewed, one defines a graph on the space of all (feasible) solutions, 
where two solutions are neighbors if one solution can be obtained from the other 
by a particular type of modification. One then searches for a node that is locally 
optimal, that is, whose cost is no more than each of its neighbors, by taking a 
walk in the graph along progressively improving nodes. 

Korupolu, Plaxton, and Rajaraman [10] analyzed a local search algorithm in 
which two nodes are neighbors if exactly one facility is added (or, symmetrically, 
deleted) in comparing the two solutions, or both one facility is added and one 
facility is deleted. They show that a local optimum with respect to this neigh- 
borhood structure has cost within a factor of 5 of optimal. One further issue 
needs to be considered in deriving an approximation algorithm: the algorithm 
needs to run in polynomial time. This can be accomplished in a variety of ways, 
but typically involves computing a “reasonably good” solution with which to 
start the search, as well as insisting that a step not merely improve the cost, 
but improve the cost “significantly”. In this way, they show that, for any $\varepsilon > 0$,
local search can yield a $(5 + \varepsilon)$-approximation algorithm. Charikar and Guha
[3] give a more sophisticated neighborhood structure that yields another simple 
3-approximation algorithm. Furthermore, they show that rescaling the relative 
weight of the facility and the assignment costs leads to a 2.415-approximation 
algorithm based on local search. Finally, they show that all of the ideas above 
can be combined: LP rounding (on multiple LPs augmented as in [7] and aug- 
mented with a greedy improvement phase), primal-dual algorithms (improved 
with the rescaling idea), and the improved local search algorithm. In this way, 
they improve the performance guarantee of 1.736 due to Chudak and Shmoys to 
1.728. This might be the best guarantee known as of this writing, but for these 
two problems, it seems unlikely to be the last word. 
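
The add, delete, and swap neighborhood described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure analyzed in [10]: the function names, the starting solution (the best single facility), and the acceptance threshold `1 - eps / n**2` are all assumptions made for this example.

```python
from itertools import product

def total_cost(open_set, f, c):
    """Opening cost plus the cost of assigning each client to its nearest open facility."""
    if not open_set:
        return float("inf")  # no open facility: infeasible
    opening = sum(f[i] for i in open_set)
    assignment = sum(min(c[i][j] for i in open_set) for j in range(len(c[0])))
    return opening + assignment

def neighbors(open_set, n):
    """Solutions obtained by adding one facility, deleting one, or swapping one for another."""
    closed = [i for i in range(n) if i not in open_set]
    for i in closed:
        yield open_set | {i}
    for i in sorted(open_set):
        yield open_set - {i}
    for i, k in product(sorted(open_set), closed):
        yield (open_set - {i}) | {k}

def local_search(f, c, eps=0.1):
    """Move to the best neighbor while it improves the cost 'significantly'.

    The threshold below is an illustrative choice, not the one used in [10].
    """
    n = len(f)
    current = min(({i} for i in range(n)), key=lambda s: total_cost(s, f, c))
    cost = total_cost(current, f, c)
    while True:
        best = min(neighbors(current, n), key=lambda s: total_cost(s, f, c))
        best_cost = total_cost(best, f, c)
        if best_cost >= (1 - eps / n**2) * cost:
            return current, cost
        current, cost = best, best_cost
```

On a toy line metric with facilities at positions 0, 5, 10 (opening cost 3 each) and clients at 0, 1, 9, 10, the search starts from the middle facility and ends with the two outer facilities open.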
Abstract. Given a directed graph G and an edge weight function
$w : E(G) \to \mathbb{R}_+$, the maximum directed cut problem (MAX DICUT) is that of
finding a directed cut $\delta(X)$ with maximum total weight. In this paper we
consider a version of MAX DICUT, namely MAX DICUT with given sizes of parts
or MAX DICUT WITH GSP, whose instance is that of MAX DICUT plus a
positive integer p, and it is required to find a directed cut $\delta(X)$ having
maximum weight over all cuts $\delta(X)$ with $|X| = p$. It is known that by
using semidefinite programming rounding techniques MAX DICUT can be
well approximated; the best approximation, with a factor of 0.859, is
due to Feige and Goemans. Unfortunately, no similar approach is known
to be applicable to MAX DICUT WITH GSP. This paper presents a
0.5-approximation algorithm for solving the problem. The algorithm is based
on exploiting structural properties of basic solutions to a linear relaxation
in combination with the pipage rounding technique developed in some
earlier papers by two of the authors.



1 Introduction 

Let G be a directed graph. A directed cut in G is defined to be the set of arcs
leaving some vertex subset X (we denote it by $\delta(X)$). Given a directed graph G
and an edge weight function $w : E(G) \to \mathbb{R}_+$, the maximum directed cut problem
(MAX DICUT) is that of finding a directed cut $\delta(X)$ with maximum total weight.
In this paper we consider a version of MAX DICUT (MAX DICUT with given sizes
of parts, or MAX DICUT WITH GSP) whose instance is that of MAX DICUT plus a
positive integer p, and it is required to find a directed cut $\delta(X)$ having maximum
weight over all cuts $\delta(X)$ with $|X| = p$. MAX DICUT is well known to be NP-hard,
and so is MAX DICUT WITH GSP, as the former evidently reduces to the latter.
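
To make the definitions concrete, the following sketch evaluates the weight of a directed cut and solves a tiny instance of MAX DICUT WITH GSP by exhaustive search. The function names and the sample arc weights are hypothetical, and brute force is used purely for illustration; it is exponential in general.

```python
from itertools import combinations

def cut_weight(X, arcs):
    """Total weight of the arcs leaving the vertex set X."""
    return sum(w for (u, v), w in arcs.items() if u in X and v not in X)

def max_dicut_gsp_bruteforce(n, arcs, p):
    """Try every vertex subset of size p and return (best weight, best subset)."""
    return max((cut_weight(set(X), arcs), X) for X in combinations(range(n), p))

# Hypothetical instance on vertices 0, 1, 2.
arcs = {(0, 1): 3, (1, 2): 2, (2, 0): 1, (0, 2): 4}
best_weight, best_set = max_dicut_gsp_bruteforce(3, arcs, 2)
```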

The NP-hardness of MAX DICUT follows from the observation that the
well-known undirected version of MAX DICUT, the maximum cut problem (MAX
CUT), which is on Karp's original list of NP-complete problems, reduces

* Supported in part by the Russian Foundation for Basic Research, grant 99-01-00601.
** Supported in part by the Russian Foundation for Basic Research, grant 99-01-00510. 



K. Jansen and S. Khuller (Eds.): APPROX 2000, LNCS 1913, pp. 34-41, 2000.
© Springer-Verlag Berlin Heidelberg 2000




An Approximation Algorithm for MAX DICUT 






to MAX DICUT by replacing each edge with two oppositely oriented arcs. This
means that for both problems there is no choice but to develop approximation
algorithms. Nevertheless, this task turned out to be highly nontrivial, as for
a long time it was an open problem whether it is possible to design
approximations with factors better than the trivial 1/2 for MAX CUT and 1/4 for MAX
DICUT. Only quite recently, using a novel technique of rounding semidefinite
relaxations, Goemans and Williamson [5] worked out algorithms solving MAX CUT
and MAX DICUT approximately within factors of 0.878 and 0.796 respectively.
A bit later Feige and Goemans [3] developed an algorithm for MAX DICUT with
a better approximation ratio of 0.859. Recently, using a new method of
rounding linear relaxations (pipage rounding), Ageev and Sviridenko [1] developed a
0.5-approximation algorithm for the version of MAX CUT in which the parts of a
vertex set bipartition are constrained to have given sizes (MAX CUT with given
sizes of parts, or MAX CUT WITH GSP). The paper [2] presents an extension of this
algorithm to a hypergraph generalization of the problem. Feige and Langberg
[4] combined the method in [1] with the semidefinite programming approach
to design a $(0.5 + \varepsilon)$-approximation for MAX CUT WITH GSP, where $\varepsilon$ is some
unspecified small positive number.

It is easy to see that MAX CUT WITH GSP reduces to MAX DICUT WITH GSP
in the same way as MAX CUT reduces to MAX DICUT. However, unlike MAX CUT
WITH GSP, MAX DICUT WITH GSP provides no possibilities for a straightforward
application of the pipage rounding, since the F/L lower bound condition in the
general description of the method (see Section 2) no longer holds for any
constant C. Fortunately, the other main condition, the $\varepsilon$-convexity, still
holds.

The main result of this paper is a 0.5-approximation algorithm for solving
MAX DICUT WITH GSP. It turns out that to construct such an algorithm one needs 
to carry out a more profound study of the problem structure. A heaven-sent 
opportunity is provided by some specific properties of basic optimal solutions to 
a linear relaxation of the problem (Theorem 1). At this point we should notice 
the papers of Jain [6], and Melkonian and Tardos [7] where exploiting structural 
properties of basic solutions was also crucial in designing better approximations 
for some network design problems. 

The resulting algorithm (DIRCUT) is of rounding type and as such consists
of two phases: the first phase is to find an optimal (fractional) solution to a
linear relaxation; the second (rounding) phase is to transform this solution to
a feasible (integral) solution. A special feature of the rounding phase is that
it uses two different rounding algorithms (ROUND1 and ROUND2) based on
the pipage rounding method and takes the best solution for the output. The
worst-case behavior analysis of the algorithm heavily relies on Theorem 1.

2 Pipage Rounding: A General Scheme 

In this section we give a general description of the pipage rounding method as
it was presented in [1].

Assume that a problem under consideration can be formulated as the
following nonlinear binary program:

$$\max\ F(x) \tag{1}$$
$$\text{s.t.}\quad \sum_{i=1}^{n} x_i = p, \tag{2}$$
$$0 \le x_i \le 1, \quad i = 1, \ldots, n, \tag{3}$$
$$x_i \in \{0, 1\}, \quad i = 1, \ldots, n \tag{4}$$


where p is a positive integer, F(x) is a function defined on the rational points
$x = (x_i)$ of the n-dimensional cube $[0, 1]^n$ and computable in polynomial time.
Assume further that one can associate with F(x) another function L(x) which
is defined and polynomially computable on the same set, coincides with F(x) on
binary x satisfying (2), and the program (which we call a nice relaxation)

$$\max\ L(x) \tag{5}$$
$$\text{s.t.}\quad \sum_{i=1}^{n} x_i = p, \tag{6}$$
$$0 \le x_i \le 1, \quad i = 1, \ldots, n \tag{7}$$

is polynomially solvable. We now state the following main conditions:

F/L lower bound condition: there exists C > 0 such that $F(x) \ge C L(x)$ for
each $x \in [0, 1]^n$;

$\varepsilon$-convexity condition: the function

$$\varphi(\varepsilon, x, i, j) = F(x_1, \ldots, x_i + \varepsilon, \ldots, x_j - \varepsilon, \ldots, x_n) \tag{8}$$

is convex with respect to $\varepsilon \in [-\min\{x_i, 1 - x_j\}, \min\{1 - x_i, x_j\}]$ for each
pair of indices i and j and each $x \in [0, 1]^n$.

We next describe the pipage rounding procedure. Its input is a fractional
solution x satisfying (2)-(3) and its output is an integral solution $\tilde{x}$ satisfying
(2)-(4) and having the property that $F(\tilde{x}) \ge F(x)$. The procedure consists of
uniform 'pipage steps'. We describe the first step. If the solution x is not binary,
then due to (2) it has at least two different components $x_i$ and $x_j$ with values
lying strictly between 0 and 1. A convex function on an interval attains its
maximum at an endpoint, and $\varepsilon = 0$ lies in the interval above, so by the
$\varepsilon$-convexity condition $\varphi(\varepsilon, x, i, j) \ge F(x)$
either for $\varepsilon = \min\{1 - x_i, x_j\}$, or for $\varepsilon = -\min\{x_i, 1 - x_j\}$. At the step x is
replaced with a new feasible solution $x' = (x_1, \ldots, x_i + \varepsilon, \ldots, x_j - \varepsilon, \ldots, x_n)$.
By construction x' has a smaller number of non-integer components and satisfies
$F(x') \ge F(x)$. After repeating the 'pipage' step at most n - 1 times we arrive
at a binary feasible solution $\tilde{x}$ with $F(\tilde{x}) \ge F(x)$. Since each step can clearly be
implemented in polynomial time, the running time of the described procedure is
polynomially bounded.

Suppose now that x is an optimal solution to (5)-(7) and both the $\varepsilon$-convexity
and the F/L lower bound conditions hold true. Then

$$F(\tilde{x}) \ge F(x) \ge C L(x) \ge C F^*,$$

where $F^*$ is the optimal value of (1)-(4); the last inequality holds because L
coincides with F on binary feasible points, so the optimum of (5)-(7) is at least
$F^*$. Thus the combination of a polynomial-time procedure for solving (5)-(7) and
the pipage rounding gives a C-approximation algorithm for solving (1)-(4). Note
that using any polynomial-time procedure that finds a solution x satisfying
(2)-(3) and $F(x) \ge C F^*$, instead of a
procedure for solving (5)-(7), also results in a C-approximation algorithm.
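
The pipage step can be written down compactly. The following Python sketch is an illustration under stated assumptions rather than the authors' implementation: `pipage_round` and `dicut_objective` are hypothetical names, the input is assumed to have an integral coordinate sum, F is assumed to satisfy the $\varepsilon$-convexity condition, and exact `Fraction` arithmetic is assumed so that the test for fractional coordinates is reliable.

```python
from fractions import Fraction

def pipage_round(x, F):
    """Round x (entries in [0, 1], integral sum) to a 0/1 vector without decreasing F."""
    x = list(x)
    while True:
        frac = [i for i, v in enumerate(x) if 0 < v < 1]
        if len(frac) < 2:
            return x
        i, j = frac[0], frac[1]
        candidates = []
        # The two endpoints of the interval of admissible epsilon values.
        for eps in (min(1 - x[i], x[j]), -min(x[i], 1 - x[j])):
            y = list(x)
            y[i] += eps
            y[j] -= eps
            candidates.append(y)
        # By convexity in epsilon, the better endpoint is at least as good as x.
        x = max(candidates, key=F)

def dicut_objective(arcs):
    """An epsilon-convex example objective: sum of w_ij * x_i * (1 - x_j) over arcs (i, j)."""
    return lambda x: sum(w * x[i] * (1 - x[j]) for (i, j), w in arcs.items())
```

Each pass makes at least one of the two chosen coordinates integral, which matches the bound of at most n - 1 steps stated above.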

3 Application: MAX DICUT WITH GSP

In this section we show an implementation of the above scheme in the case of 
MAX DICUT WITH GSP and, in addition, specify the character of obstacles to the 
direct application of the pipage rounding method. 

In what follows G = (V, A) is assumed to be the input (directed) graph with 
\V\ = n. 

First, note that MAX DICUT WITH GSP can be formulated as the following
nonlinear binary program:

$$\max\ F(x) = \sum_{ij \in A} w_{ij} x_i (1 - x_j)$$
$$\text{s.t.}\quad \sum_{i \in V} x_i = p,$$
$$x_i \in \{0, 1\}, \quad \forall i \in V.$$

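Second, just like MAX CUT WITH GSP in [1], MAX DICUT WITH GSP can be
formulated as the following integer program:

$$\max\ \sum_{ij \in A} w_{ij} z_{ij} \tag{9}$$
$$\text{s.t.}\quad z_{ij} \le x_i \quad \text{for all } ij \in A, \tag{10}$$
$$z_{ij} \le 1 - x_j \quad \text{for all } ij \in A, \tag{11}$$
$$\sum_{i \in V} x_i = p, \tag{12}$$
$$0 \le x_i \le 1 \quad \text{for all } i \in V, \tag{13}$$
$$x_i, z_{kj} \in \{0, 1\} \quad \text{for all } i \in V,\ kj \in A. \tag{14}$$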



Observe now that the variables $z_{ij}$ can be excluded from (9)-(13) by setting
$z_{ij} = \min\{x_i, (1 - x_j)\}$ for all $ij \in A$.

Hence (9)-(13) is equivalent to maximizing

$$L(x) = \sum_{ij \in A} w_{ij} \min\{x_i, (1 - x_j)\} \tag{15}$$

subject to (12), (13).
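
The relationship between the two objectives can be checked numerically on a small instance. In the sketch below the arc weights are a hypothetical example; `F` and `L` follow the nonlinear objective and the function (15), respectively.

```python
from fractions import Fraction
from itertools import product

def F(x, arcs):
    """Nonlinear objective: sum of w_ij * x_i * (1 - x_j)."""
    return sum(w * x[i] * (1 - x[j]) for (i, j), w in arcs.items())

def L(x, arcs):
    """Relaxation objective (15): sum of w_ij * min(x_i, 1 - x_j)."""
    return sum(w * min(x[i], 1 - x[j]) for (i, j), w in arcs.items())

arcs = {(0, 1): 3, (1, 2): 2, (2, 0): 1, (0, 2): 4}  # hypothetical instance

# F and L coincide on every binary vector.
binary_agree = all(F(x, arcs) == L(x, arcs) for x in product((0, 1), repeat=3))

# On fractional points L can only be larger, since ab <= min(a, b) on [0, 1].
grid = [Fraction(k, 4) for k in range(5)]
relaxation_dominates = all(L(x, arcs) >= F(x, arcs) for x in product(grid, repeat=3))
```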
Thus we have functions F and L that can be considered as those involved in
the description of the pipage rounding (see Section 2). Moreover, the function
F obeys the $\varepsilon$-convexity condition, as the function $\varphi(\varepsilon, x, i, j)$ defined by (8) is
a quadratic polynomial in $\varepsilon$ with a nonnegative leading coefficient for each pair
of indices i and j and each $x \in [0, 1]^n$. Unfortunately, the other condition, the
F/L lower bound, does not hold for any constant C > 0. We present below
an example showing that the ratio F(x)/L(x) may be arbitrarily close to 0 even
when the underlying graph is bipartite.

Example 1. Let $V = V_1 \cup V_2 \cup V_3$ where $|V_1| = k$, $|V_2| = |V_3| = 2$. Let
$A = A_1 \cup A_2$ where $A_1$ is the set of 2k arcs from $V_1$ to $V_2$ inducing a complete
bipartite graph on $(V_1, V_2)$ and $A_2$ is the set of 4 arcs from $V_2$ to $V_3$ inducing a
complete bipartite graph on $(V_2, V_3)$.

The optimal value of L is 2p and it can be obtained in more than one way.
One way is to let $x_i = r = (p - 2)/(k - 2)$ for $i \in V_1$, $x_i = 1 - r$ for $i \in V_2$ and
$x_i = 0$ for $i \in V_3$; the value of r is forced by the constraint (12). We get that
when k and p tend to infinity in such a way that $k r^2$ tends to 0 (for example, k = p^),
$F = 2kr^2 + 4(1 - r)$ tends to 4 and F/L tends to 0.

Note that the same can be done with $|V_3| > 2$ and then the above solution
will be the unique optimum.

Thus a direct application of the pipage rounding method does not provide a
constant-factor approximation.
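
The collapse of the ratio F/L in Example 1 can be reproduced numerically. The sketch below builds the instance with unit arc weights and evaluates both objectives at the fractional point described in the example. The unit weights, the function name `example1`, and the choice $k = p^3$ in the usage lines are assumptions made for illustration; they are not taken from the text.

```python
from fractions import Fraction

def example1(k, p):
    """Return (F, L) at x_i = r on V1, 1 - r on V2, 0 on V3, with r = (p - 2)/(k - 2)."""
    V1 = range(k)
    V2 = (k, k + 1)
    V3 = (k + 2, k + 3)
    arcs = [(i, j) for i in V1 for j in V2] + [(i, j) for i in V2 for j in V3]
    r = Fraction(p - 2, k - 2)
    x = [r] * k + [1 - r] * 2 + [Fraction(0)] * 2
    assert sum(x) == p  # feasibility for constraint (12)
    F = sum(x[i] * (1 - x[j]) for i, j in arcs)      # unit weights assumed
    L = sum(min(x[i], 1 - x[j]) for i, j in arcs)
    return F, L

p = 20
k = p ** 3          # assumed growth rate, chosen so that k * r**2 is small
F_val, L_val = example1(k, p)
ratio = F_val / L_val
```

With these values L equals 2p exactly, while F stays close to 4, so the ratio is already near 0.1 and keeps shrinking as p grows.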

Example 1 can also be used to show that the greedy algorithm (at each
step add a vertex which increases most or decreases least the weight of the cut)
does not yield any constant-factor approximation. For this instance the greedy
algorithm may first choose the vertices of $V_2$ and then no more arcs can be added
and a solution with only 4 arcs will be the outcome (while the optimal one is to
choose p vertices from $V_1$, which gives a cut of size 2p).



4 The Structure of Basic Solutions 

The following statement is a crucial point in constructing a 0.5-approximation 
for MAX DICUT WITH GSP. 

Theorem 1. Let (x, z) be a basic feasible solution to the linear relaxation
(9)-(13). Then

$$x_i \in \{0, \delta, 1/2, 1 - \delta, 1\} \quad \text{for each } i, \tag{16}$$

for some $0 < \delta < 1/2$.

Proof. Let (x, z) be a basic feasible solution. Then by the definition of a basic
solution, (x, z) is the unique solution to the system of linear equations obtained
from the subset of constraints (10)-(13) which are active for (x, z), i.e. those
which hold with equality. First, observe that for a variable $z_{ij}$ either both
$z_{ij} \le x_i$ and $z_{ij} \le 1 - x_j$
hold with equalities or exactly one holds with equality and the other with strict
inequality. In the former case we exclude $z_{ij}$ by replacing these equalities with
the single equality $x_i + x_j = 1$. In the latter case we delete the equality from the
linear system. The reduced system will have the following form:

$$y_i + y_j = 1 \quad \text{for } ij \in A' \subseteq A, \tag{17}$$
$$\sum_{i} y_i = p, \tag{18}$$
$$y_i = 0 \quad \text{for every } i \text{ such that } x_i = 0, \tag{19}$$
$$y_i = 1 \quad \text{for every } i \text{ such that } x_i = 1. \tag{20}$$

By construction x is the unique solution to the system (17)-(20). Now remove
all components of x equal to either 0 or 1 or 1/2 (equivalently, fix $y_i = x_i$ for
each i such that $x_i \in \{0, 1/2, 1\}$) and denote the set of the remaining indices
by $I^*$. Denote the subvector of x consisting of the remaining components by x'.
Then we get a system of the following form:

$$y'_i + y'_j = 1 \quad \text{for } ij \in A'' \subseteq A', \tag{21}$$
$$\sum_{i \in I^*} y'_i = p', \tag{22}$$

where $p' \le p$. By construction, x' is the unique solution to this system. It follows
that we can choose $|I^*|$ independent equations from (21)-(22). We claim that any
subsystem of this sort must contain the equation (22). Assume to the contrary
that $|I^*|$ equations from the set (21) form an independent system. Consider the
(undirected) subgraph H of G (we discount orientations) corresponding to these
equations. Note that $|E(H)| = |I^*|$. Since $y_i \ne 1/2$ for every $i \in I^*$, H does
not contain odd cycles (the equations along an odd cycle would force every value
on it to be 1/2). Moreover, H cannot have even cycles as the subsystem
corresponding to such a cycle is clearly dependent. Thus H is acyclic. But then
$|E(H)| \le |I^*| - 1$, a contradiction. Now fix $|I^*|$ independent equations from
(21)-(22). Then we have the system:

$$y'_i + y'_j = 1 \quad \text{for } ij \in A^*, \tag{23}$$
$$\sum_{i \in I^*} y'_i = p', \tag{24}$$

where $A^* \subseteq A'$. Since all equations in the system (23)-(24) are independent,
$|I^*| = |A^*| + 1$. Above we have proved that the subgraph induced by $A^*$ is
acyclic, which together with $|I^*| = |A^*| + 1$ implies that it is a tree. It follows
that the components of x with indices in $I^*$ split into two sets: those equal to
some $0 < \delta < 1/2$ and those equal to $1 - \delta$. □

5 Algorithm DIRCUT

Section 3 demonstrates that MAX DICUT WITH GSP in the general setting does
not admit a direct application of the pipage rounding method. In this section we
show that by using Theorem 1 and some tricks one is able to design not only a
constant factor but even a 0.5-approximation for solving MAX DICUT WITH GSP.
Moreover, the performance bound of 0.5 cannot be improved using different
methods of rounding, as the integrality gap of (9)-(13) can be arbitrarily close
to 1/2 (this can be shown exactly in the same way as it was done for MAX CUT
WITH GSP in [1]).

Observe first that for any $a, b \in [0, 1]$, $\min\{a, b\} \max\{a, b\} = ab$. Thus,
whenever the denominator is nonzero,

$$\frac{x_i (1 - x_j)}{\min\{x_i, 1 - x_j\}} = \max\{x_i, 1 - x_j\}. \tag{25}$$




Algorithm DIRCUT consists of two phases: the first phase is to find an
optimal (fractional) basic solution to the linear relaxation (9)-(13); the second
(rounding) phase is to transform this solution to a feasible (integral) solution.
The rounding phase runs two different rounding algorithms based on the
pipage rounding method and takes the best solution for the output. Let (x, z)
denote a basic optimal solution to (9)-(13) obtained at the first phase.
Recall that, by Theorem 1, the vector x satisfies (16). Set $V_1 = \{i : x_i = \delta\}$,
$V_2 = \{i : x_i = 1 - \delta\}$, $V_3 = \{i : x_i = 1/2\}$, $V_4 = \{i : x_i = 0 \text{ or } 1\}$. For $ij \in A$, call
the number $w_{ij} \min\{x_i, (1 - x_j)\}$ the contributed weight of the arc ij. Denote
by $l_{ij}$ (i, j = 1, 2, 3, 4) the sum of contributed weights over all arcs going from $V_i$
to $V_j$.

Set $l_0 = l_{11} + l_{21} + l_{22} + l_{23} + l_{31}$, $l_1 = l_{33} + l_{13} + l_{32}$, $l_2 = \sum_{i=1}^{4} (l_{i4} + l_{4i})$.

The second phase of the algorithm successively calls two rounding algorithms,
ROUND1 and ROUND2, and takes a solution with maximum weight for
the output.

ROUND1 is the pipage rounding applied to the optimal basic solution x. By
using the property (16) and the relation (25), it is easy to check that ROUND1
outputs an integral solution of weight at least
$F(x) = \delta l_{12} + (1 - \delta) l_0 + l_1/2 + l_2$.

Algorithm ROUND2 is the pipage rounding applied to a different fractional
solution x' which is obtained by an alteration of x.



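Algorithm ROUND2

Define a new vector x' by the following formulas:

$$x'_i = \begin{cases}
\min\{1,\ \delta + (1 - \delta)|V_2|/|V_1|\} & \text{if } i \in V_1, \\
\max\{0,\ (1 - \delta) - (1 - \delta)|V_1|/|V_2|\} & \text{if } i \in V_2, \\
x_i & \text{if } i \in V \setminus (V_1 \cup V_2).
\end{cases} \tag{26}$$

Apply the pipage rounding to x'.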
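
The alteration (26) moves value from $V_2$ to $V_1$ while keeping the coordinate sum unchanged. A minimal Python sketch of this step follows; the function name `round2_alteration` is hypothetical, exact `Fraction` inputs are assumed so that the sets $V_1$ and $V_2$ can be identified by equality, and the early return for an empty set is a guard added for the sketch only.

```python
from fractions import Fraction

def round2_alteration(x, delta):
    """Apply formula (26) to a vector x whose entries lie in {0, delta, 1/2, 1 - delta, 1}."""
    V1 = [i for i, v in enumerate(x) if v == delta]
    V2 = [i for i, v in enumerate(x) if v == 1 - delta]
    if not V1 or not V2:
        return list(x)  # guard for the sketch: (26) divides by |V1| and |V2|
    up = min(Fraction(1), delta + (1 - delta) * Fraction(len(V2), len(V1)))
    down = max(Fraction(0), (1 - delta) - (1 - delta) * Fraction(len(V1), len(V2)))
    y = list(x)
    for i in V1:
        y[i] = up
    for i in V2:
        y[i] = down
    return y
```

For instance, with $\delta = 1/4$, three vertices in $V_1$ and one in $V_2$, the $V_1$ entries rise to 1/2 and the $V_2$ entry drops to 0, so the sum is preserved.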
Analysis. The vector x' is obtained from x by redistributing uniformly the
values from the vertices in $V_2$ to the vertices in $V_1$. By construction x' is feasible.
Applying the pipage rounding to x' results in an integral feasible vector of weight
at least F(x'). We claim that $F(x') \ge l_{12} + l_1/2$. Consider first the case when
$|V_1| \ge |V_2|$. Then by (26), $x'_i = 0$ for all $i \in V_2$ and $x'_i \ge \delta$ for all $i \in V_1$.
So, by (25), $F(x') \ge l_{32} + l_{12} + \frac{1}{2}(l_{13} + l_{33})$. Now assume that $|V_1| < |V_2|$.
Then by (26), $x'_i = 1$ for all $i \in V_1$ and $x'_i \le 1 - \delta$ for all $i \in V_2$. Hence, by (25),
$F(x') \ge l_{13} + l_{12} + \frac{1}{2}(l_{32} + l_{33})$. So, in either case $F(x') \ge l_{12} + l_1/2$. Thus ROUND2
outputs a solution of weight at least $l_{12} + l_1/2$.

By the description of DIRCUT its output has weight at least

$$\max\{l_{12} + l_1/2,\ \delta l_{12} + (1 - \delta) l_0 + l_1/2 + l_2\},$$

which is bounded from below by

$$q = \max\{l_{12},\ \delta l_{12} + (1 - \delta) l^*\} + l_1/2$$

where $l^* = l_0 + l_2$. Recall that $0 < \delta < 1/2$. Hence, if $l_{12} \ge l^*$, then
$q = l_{12} + l_1/2 \ge \frac{1}{2}(l_{12} + l^* + l_1)$, and if $l_{12} < l^*$, then
$q = \delta l_{12} + (1 - \delta) l^* + l_1/2 \ge \frac{1}{2}(l_{12} + l^*) + l_1/2$.
Thus, in either case algorithm DIRCUT outputs a solution of weight at least
$\frac{1}{2}(l_{12} + l_0 + l_1 + l_2)$, which is at least half of the optimum.
Abstract. We consider a benefit model for on-line preemptive scheduling.
In this model jobs arrive at the on-line scheduler at their release
time. Each job arrives with its own execution time and its benefit func- 
tion. The flow time of a job is the time that passes from its release to 
its completion. The benefit function specifies the benefit gained for any 
given flow time. A scheduler’s goal is to maximize the total gained ben- 
efit. We present a constant competitive ratio algorithm for that model 
in the uniprocessor case for benefit functions that do not decrease too 
fast. We also extend the algorithm to the multiprocessor case while main- 
taining constant competitiveness. The multiprocessor algorithm does not 
use migration, i.e., preempted jobs continue their execution on the same 
processor on which they were originally processed. 



1 Introduction 

1.1 The Basic Problem 

We are given a sequence of n jobs to be assigned to one machine. Each job j has
a release time $r_j$ and a length or execution time $w_j$. Each job is known to the
scheduler only at its release time. The scheduler may schedule the job at any
time after its release time. The system allows preemption, that is, the scheduler
may stop a job and later continue running it. Note that the machine can process
only one job at a time. If job j is completed at time $C_j$ then we define its flow
time as $f_j = C_j - r_j$ (which is at least $w_j$).

In the machine scheduling problem there are two major models. The first is 
the cost model where the goal is to minimize the total (weighted) flow time. The 
second is the benefit model, where each job comes with its own deadline, and 
the goal is to maximize the benefit of jobs that meet their deadline. Both models
have their disadvantages and the performance measurement is often misleading. 
In the cost model, a small delay in a loaded system keeps interfering with new 
jobs. Every new job has to wait a small amount of time before the system is free. 
The result is a very large increase in total cost. The above might suggest that the 
benefit model is favorable. However, it still lacks an important property: in many 
real cases, jobs are delayed by some small constant and should therefore reduce 
the overall system performance, but only by some small factor. In the standard 
benefit model, jobs that are delayed beyond their deadline cease to contribute 
to the total benefit. Thus, the property we are looking for is the possibility to 
delay jobs without drastically harming overall system performance. 

We present a benefit model where the benefit of a job is a function of its flow time:
the longer the processing of a job takes, the lower its benefit is. More specifically,
each job j has an arbitrary monotone non-increasing non-negative benefit density
function $B_j(t)$ for $t \ge w_j$ and the benefit gained is $w_j B_j(f_j)$ where $f_j$ is its flow
time. Note that the benefit density function may be different for each job. The
goal of the scheduler is to schedule the jobs so as to maximize the total benefit, i.e.,
$\sum_j w_j B_j(f_j)$ where $f_j$ is the flow time of job j. Note that the benefit density
functions of different jobs can be uncorrelated and the ratio between their values
can be arbitrarily large. However, we restrict each $B_j(t)$ to satisfy

$$\frac{B_j(t)}{B_j(t + w_j)} \le C$$

for some fixed constant C. That is, if we delay a job by its length then we lose
only a constant factor in its benefit.

An on-line algorithm is measured by its competitive ratio, defined as

$$\sup_I \frac{OPT(I)}{A(I)}$$

where A(I) denotes the benefit gained by the on-line algorithm A on input I,
and OPT(I) denotes the benefit gained by the optimal schedule.

As with many other scheduling problems, the uniprocessor model presented 
above can be extended to a multiprocessor model where instead of just one ma- 
chine, we are given m identical machines. A job can be processed by at most 
one machine at a time. The only definition that needs further explanation is the 
definition of preemption. In the multiprocessor model we usually allow the sched- 
uler to preempt a job and later continue running it on a different machine. That 
operation, known as migration, can be costly in many realistic multiprocessor 
systems. A desirable property of a multiprocessor scheduler is that it would not 
use migration, i.e., once a job starts running on a machine, it continues running 
there up to its completion. Our multiprocessor algorithm has that property with 
no significant degradation in performance. 
1.2 The Results in this Paper

The main contribution of the paper is in defining a general model of benefit 
and providing a constant competitive algorithm for this model. We begin with 
describing and analyzing the uniprocessor scheduling algorithm. Later, we extend 
the result to the multiprocessor case. Our multiprocessor algorithm does not use 
migration. Nevertheless, there is no such restriction on the optimal algorithm. In 
other words, the competitiveness result is against a possibly migrative optimal 
algorithm. 



1.3 Previous Work 

The benefit model of real-time scheduling presented above is a well studied one. 
An equivalent way of looking at deadlines is to look at benefit density functions 
of the following ‘stair’ form: the benefit density for flow times which are less than
or equal to a certain value is fixed. Beyond that certain point, the benefit density
is zero. The time in which the flow time of a job passes that point is the job’s 
deadline. Such benefit density functions do not match our requirements because 
of their sharp decrease. 
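
The difference between a slowly decaying benefit density and a 'stair' density can be seen by testing the restriction $B_j(t) \le C \cdot B_j(t + w_j)$ directly. The sketch below is illustrative only: the helper name and both sample densities are assumptions, and the check runs over a finite set of sample times rather than all $t \ge w_j$.

```python
from fractions import Fraction

def respects_decay_bound(B, w, C, times):
    """Check B(t) <= C * B(t + w) at each sampled time t >= w."""
    return all(B(t) <= C * B(t + w) for t in times)

def smooth(t):
    """Hypothetical density decaying like 1/t."""
    return Fraction(1, t)

def stair(t):
    """Hypothetical firm deadline at flow time 5: density 1 before, 0 after."""
    return 1 if t <= 5 else 0

w = 2
times = range(w, 50)
smooth_ok = respects_decay_bound(smooth, w, 2, times)     # (t + 2)/t <= 2 for t >= 2
stair_ok = respects_decay_bound(stair, w, 1000, times)    # fails for any finite C
```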

As a result of the firm deadline, the real-time scheduling model is hard to
approximate. The optimal deterministic competitive ratio for the single processor
case is $\Theta(\Phi)$ where $\Phi$ is the ratio between the maximum and minimum benefit
densities [3, 4, 7]. For the special case where $\Phi = 1$, there is a 4-competitive
algorithm. The optimal randomized competitive ratio for the single processor
case is $O(\min(\log \Phi, \log \Delta))$ where $\Delta$ is the ratio between the longest and shortest
job [6].

For the multiprocessor case, Koren and Shasha [8] show that when the
number of machines is very large an $O(\log \Phi)$ competitive algorithm is possible. That
result is shown to be optimal. Their algorithm achieves that competitive ratio
without using migration.

Another related problem is the problem of minimizing the total flow time. 
Recall that in this problem individual benefits do not exist and the goal function 
is minimizing the sum (or equivalently, average) of flow times over all jobs. 
Unlike real-time scheduling, the single processor case is solvable in polynomial 
time using the shortest remaining processing time first rule [2]. Using this rule,
also known as SRPT, the algorithm assigns the jobs whose remaining processing 
time is the lowest to the available machines. 
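
For the single machine case, the SRPT rule is short enough to simulate directly. The following sketch is an illustration with a hypothetical function name; it assumes non-negative release times and numeric job lengths, and it returns the total flow time of the resulting preemptive schedule.

```python
import heapq

def srpt_total_flow_time(jobs):
    """jobs: list of (release, length). Single machine, preemptive SRPT."""
    jobs = sorted(jobs)
    heap = []                 # entries are (remaining time, release time)
    t, i, total = 0, 0, 0
    while i < len(jobs) or heap:
        if not heap:
            t = max(t, jobs[i][0])        # machine idles until the next release
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        remaining, release = heapq.heappop(heap)
        next_release = jobs[i][0] if i < len(jobs) else float("inf")
        run = min(remaining, next_release - t)   # run until done or next arrival
        t += run
        if run == remaining:
            total += t - release                 # job finished: add its flow time
        else:
            heapq.heappush(heap, (remaining - run, release))
    return total
```

On the two jobs (release 0, length 3) and (release 1, length 1), SRPT preempts the long job at time 1, giving total flow time 5, whereas running the long job to completion first would give 6.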

Minimizing the total flow time with more than one machine becomes
NP-hard [5]. In their paper [9], Leonardi and Raz analyzed the performance
of the SRPT algorithm. They showed that it achieves a competitive ratio of
$O(\log(\min\{\Delta, n/m\}))$ where $\Delta$ is the ratio between the longest and shortest
processing time. They also show that SRPT is optimal with two lower bounds for
on-line algorithms, $\Omega(\log(n/m))$ and $\Omega(\log \Delta)$. A fundamental property of SRPT
is the use of migration. In a recent paper [1], an algorithm which achieves
almost the same competitive ratio is shown. This algorithm however does not use
migration.
2 The Algorithm

In order to describe the algorithm we define three ‘storage’ locations for jobs. 
The first is the pool where new jobs arrive and stay there until their processing 
begins. Once the scheduler decides a job should begin running, the job is removed 
from the pool and pushed to the stack where its processing begins. Two different 
possibilities exist at the end of a job’s life cycle. The first is a job that is completed 
and can be popped from the stack. The second is a job that after staying too 
long in the stack got thrown into the garbage collection. The garbage collection 
holds jobs whose processing we prefer to defer. The actual processing can occur 
when the system reaches an idle state. Throwing a job to the garbage collection 
means we gain nothing from it and we prefer to throw it in order to make room 
for other jobs. 

The job at the top of the stack is the job that is currently running. The other
jobs in the stack are preempted jobs. For each job j denote by $s_j$ the time it
enters the stack. We define its breakpoint as the time $s_j + 2w_j$. In case a job is
still running when its breakpoint arrives, it is thrown to the garbage collection.
We also define priorities for each job in the pool and in the stack. The priority
of job j at time t is denoted by $d_j(t)$. For $t \le s_j$ it is $B_j(t + w_j - r_j)$ and for
time $t > s_j$ it is $d_j = B_j(s_j + w_j - r_j)$. In other words, the priority of a job in
the pool is its benefit density if it would have run to completion starting at the
current time t. Once it enters the stack its priority becomes fixed, i.e. remains
the priority at the time $s_j$.

We describe Algorithm ALG1 as an event driven algorithm. The algorithm
takes action at time t when a new job is released, when the currently running
job is completed or when the currently running job reaches its breakpoint. If
some events happen at the same time we handle the completion of jobs first.

— A new job l arrives. In case $d_l(t) > 4d_k$ where k is the job at the top of
the stack or in case the stack is empty, push job l to the stack and run it.
Otherwise, just add job l to the pool.

— A job at the top of the stack completes or reaches its breakpoint. Then, pop
jobs from the top of the stack and insert them to the garbage collection as
long as their breakpoints have been reached. Unless the stack is empty, let k
be the index of the new job at the top of the stack. Continue running job k
only if $d_j(t) \le 4d_k$ for all j in the pool. In any other case, get the job from
the pool with maximum $d_j(t)$, push it into the stack and run it.

We note several facts about this algorithm: 

Observation 2.1. Every job enters the stack at some point in time. Then, by
time $s_j + 2w_j$, it is either completed or reaches its breakpoint and gets thrown to
the garbage collection.

Observation 2.2. The priority of a job is monotone non-increasing over time.
Once the job enters a stack, its priority remains fixed until it is completed or
thrown. At any time the priority of each job in a stack is at least 4 times higher
than the priority of the job below it.

Observation 2.3. Whenever the pool is not empty, the machine is not idle, that
is, the stack is not empty. Moreover, the priority of jobs in the pool is always at
most 4 times higher than the priority of the currently running job.
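
The event-driven description of ALG1 translates fairly directly into a simulation. The Python sketch below is one possible reading of the rules above, not the authors' code: the function and variable names are hypothetical, times are assumed to be non-negative integers (or `Fraction` values) so that equality tests are exact, and each job is given as a triple of release time, length, and benefit density function.

```python
def alg1(jobs):
    """Simulate ALG1 on one machine.

    jobs: list of (r_j, w_j, B_j). Returns (completed, discarded), where
    completed maps job index to completion time and discarded lists the jobs
    thrown to the garbage collection.
    """
    n = len(jobs)
    by_release = sorted(range(n), key=lambda j: jobs[j][0])
    nxt = 0                                  # next arrival in by_release
    pool, stack = [], []
    remaining, entered, prio = {}, {}, {}
    completed, discarded = {}, []
    t = 0

    def pool_priority(j, now):               # d_j(t) = B_j(t + w_j - r_j)
        r, w, B = jobs[j]
        return B(now + w - r)

    def breakpoint_of(j):                    # s_j + 2 * w_j
        return entered[j] + 2 * jobs[j][1]

    def push(j, now):                        # priority is frozen on entry
        entered[j], prio[j], remaining[j] = now, pool_priority(j, now), jobs[j][1]
        stack.append(j)

    while nxt < n or stack:
        events = []
        if nxt < n:
            events.append(jobs[by_release[nxt]][0])
        if stack:
            top = stack[-1]
            events.append(min(t + remaining[top], breakpoint_of(top)))
        now = min(events)
        if stack:
            remaining[stack[-1]] -= now - t  # only the top job runs
        t = now
        # Completion or breakpoint of the running job, handled before arrivals.
        if stack and (remaining[stack[-1]] == 0 or t >= breakpoint_of(stack[-1])):
            if remaining[stack[-1]] == 0:
                completed[stack.pop()] = t
            while stack and t >= breakpoint_of(stack[-1]):
                discarded.append(stack.pop())
            if pool:
                best = max(pool, key=lambda j: pool_priority(j, t))
                if not stack or pool_priority(best, t) > 4 * prio[stack[-1]]:
                    pool.remove(best)
                    push(best, t)
        # Arrivals at time t.
        while nxt < n and jobs[by_release[nxt]][0] == t:
            l = by_release[nxt]
            nxt += 1
            if not stack or pool_priority(l, t) > 4 * prio[stack[-1]]:
                push(l, t)
            else:
                pool.append(l)
    return completed, discarded
```

With constant densities, a job of density 10 arriving while a job of density 1 runs preempts it, while a job of density 2 waits in the pool; a preempted job that surfaces after its breakpoint is discarded.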

3 The Analysis 

We begin by fixing an input sequence and hence the behavior of the optimal
algorithm and the on-line algorithm. We denote by $f_j^{OPT}$ the flow time of job j
by the optimal algorithm. As for the on-line algorithm, we look at the benefit
of jobs which were not thrown to the garbage collection. Denote the set of these
jobs by A. So, for $j \in A$, let $f_j^{ON}$ be the flow time of job j by the on-line
algorithm. By definition,

$$V^{OPT} = \sum_{j} w_j B_j(f_j^{OPT})$$

and

$$V^{ON} = \sum_{j \in A} w_j B_j(f_j^{ON}).$$

We also define the pseudo-benefit of a job j by $w_j d_j$. That is, each job contributes a
benefit of $w_j d_j$ as if it runs to completion without interruption from the moment
it enters the stack. Define the pseudo-benefit of the online algorithm as

$$V^{PSEUDO} = \sum_{j} w_j d_j.$$

For $0 \le t < w_j$ we define $B_j(t) = B_j(w_j)$. In addition, we partition the set
of jobs J into two sets, $J_1$ and $J_2$. The first is the set of jobs which are still
processed by the optimal scheduler at time $s_j$, when they enter the stack. The
second is the set of jobs which have been completed by the optimal scheduler
before they enter the stack.
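Lemma 1. For the jobs in $J_1$, $\sum_{j \in J_1} w_j B_j(f_j^{OPT}) \le C \cdot V^{PSEUDO}$.

Proof. We note the following:

$$w_j B_j(f_j^{OPT}) \le C \cdot w_j B_j(f_j^{OPT} + w_j) \le C \cdot w_j B_j(s_j - r_j + w_j) = C \cdot w_j d_j$$

where the first inequality is by our assumptions on $B_j$ and the second is by our
definition of $J_1$. Summing over jobs in $J_1$:

$$\sum_{j \in J_1} w_j B_j(f_j^{OPT}) \le C \sum_{j \in J_1} w_j d_j \le C \cdot V^{PSEUDO}.$$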



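Lemma 2. For the jobs in $J_2$, $\sum_{j \in J_2} w_j B_j(f_j^{OPT}) \le 4C \cdot V^{PSEUDO}$.

Proof. For each $j \in J_2$ we define its 'optimal processing time' as:
$T_j = \{t \mid \text{job } j \text{ is processed by OPT at time } t\}$.

$$\begin{aligned}
\sum_{j \in J_2} w_j B_j(f_j^{OPT}) &= \sum_{j \in J_2} \int_{t \in T_j} B_j(f_j^{OPT})\,dt \\
&\le \sum_{j \in J_2} \int_{t \in T_j} B_j(t - r_j)\,dt \\
&\le C \sum_{j \in J_2} \int_{t \in T_j} d_j(t)\,dt.
\end{aligned}$$

According to the definition of $J_2$, during the processing of job $j \in J_2$ by the
optimal algorithm, the on-line algorithm still keeps the job in its pool. By
Observation 2.3 we know that the job's priority is not too high; it is at most 4
times the priority of the currently running job and specifically, at time $t \in T_j$ its
priority is at most 4 times the priority of the job at the top of the stack in the
on-line algorithm. Denote that job by j(t). So,

$$\begin{aligned}
\sum_{j \in J_2} w_j B_j(f_j^{OPT}) &\le 4C \cdot \sum_{j \in J_2} \int_{t \in T_j} d_{j(t)}\,dt \\
&\le 4C \cdot \int_{t \in \cup_j T_j} d_{j(t)}\,dt \\
&\le 4C \cdot \int_{t} d_{j(t)}\,dt \\
&\le 4C \cdot \sum_{j \in J} w_j d_j = 4C \cdot V^{PSEUDO}.
\end{aligned}$$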

j&J 



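Corollary 1. $V^{OPT} \le 5C \cdot V^{PSEUDO}$.

Proof. Combining the two lemmas we get 

\[
V^{OPT} = \sum_{j \in J_1} w_j B_j(f_j^{OPT}) + \sum_{j \in J_2} w_j B_j(f_j^{OPT})
  \le C \cdot V^{PSEUDO} + 4C \cdot V^{PSEUDO} = 5C \cdot V^{PSEUDO}.
\]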



Lemma 3. $V^{PSEUDO} \le 2C \cdot V^{ON}$.

Proof. We show a way to divide a benefit of $C \cdot V^{ON}$ between all the jobs such 
that the ratio between the pseudo gain of each job and the gain allocated to it is at 
most 2. 
We begin by ordering the jobs such that jobs are preempted only by jobs 
appearing earlier in the order. This is done by looking at the preemption graph: 
each node represents a job and the directed edge $(j,k)$ indicates that job $j$ preempts 
job $k$ at some time in the on-line algorithm. This graph is acyclic since the 
edge $(j,k)$ exists only if $d_j > d_k$. We use a topological order of this graph in 
our construction. Jobs can only be preempted by jobs appearing earlier in this 
order. 
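
The ordering step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `preemption_order` and the dictionary input format are hypothetical, and the topological sort comes from Python's standard `graphlib` module rather than from anything in the paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def preemption_order(preempts):
    """Return jobs ordered so that every job appears before the jobs it preempted.

    `preempts` maps a job j to the set of jobs k that j preempted at some time,
    i.e. the directed edges (j, k) of the preemption graph.
    """
    predecessors = {}
    for j, victims in preempts.items():
        predecessors.setdefault(j, set())
        for k in victims:
            # Edge (j, k): j must come earlier than k in the order.
            predecessors.setdefault(k, set()).add(j)
    return list(TopologicalSorter(predecessors).static_order())


# Hypothetical run: job "c" preempted "b", and "b" preempted "a".
order = preemption_order({"c": {"b"}, "b": {"a"}})
```

In this toy run the preempting job always precedes the preempted one, which is the property the benefit-moving argument relies on.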

We begin by assigning a benefit of $w_j d_j$ to any job $j$ in $A$, the set of jobs not 
thrown to the garbage collection. At the end of the process the benefit allocated 
to each job, not necessarily in $A$, will be at least $\frac{1}{2} w_j d_j$. 

According to the order defined above, we consider one job at a time. Assume 
we arrive at job $j$. In case $j \in A$, it already has a benefit of $w_j d_j$ assigned to 
it. Otherwise, job $j$ got thrown to the garbage collection. This job entered the 
stack at time $s_j$ and left it at time $s_j + 2w_j$. During that time the scheduler 
actually processed the job for less than $w_j$ time. So, during more than $w_j$ time 
job $j$ was preempted. For any job $k$ running while job $j$ is preempted, denote by 
$U_{k,j}$ the set of times when job $j$ is preempted by job $k$. Then, move a benefit of 
$|U_{k,j}| \cdot d_j$ from $k$ to $j$. Therefore, once we finish with job $j$, its allocated benefit 
is at least $w_j d_j$. 

How much benefit is left allocated to each job $j$ at the end of the process? 
We have seen that before moving on to the next job, the benefit allocated to 
job $j$ is at least $w_j d_j$ (whether or not $j \in A$). When job $j$ enters the stack at 
time $s_j$ it preempts several jobs; these jobs appear later in the order. Since jobs 
are added and removed only from the top of the stack, as long as job $j$ is in the 
stack the set of jobs preempted by it remains unchanged. Each job $k$ of these 
jobs gets from $j$ a benefit of at most $w_j d_k$. However, since all of these jobs exist 
together with $j$ in the stack at time $s_j$, the sum of their priorities is at most $\frac{1}{2} d_j$ 
(according to Observation 22). So, after moving all the required benefit, job $j$ is 
left with at least $\frac{1}{2} w_j d_j$ as needed. 

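In order to complete the proof, 

\[
V^{PSEUDO} = \sum_{j} w_j d_j
  \le 2 \sum_{j \in A} w_j d_j
  \le 2C \sum_{j \in A} w_j B_j(s_j - r_j + 2w_j)
  \le 2C \sum_{j \in A} w_j B_j(f_j^{ON})
  = 2C \cdot V^{ON}.
\]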



Theorem 1. Algorithm ALG1 is $10C^2$-competitive. 

Proof. By combining the previous lemmas, we conclude that 

\[ V^{ON} \ge \frac{V^{PSEUDO}}{2C} \ge \frac{V^{OPT}}{10C^2}. \]



4 Multiprocessor Scheduling 

We extend Algorithm ALG1 to the multiprocessor model. In this model, the 
algorithm holds $m$ stacks, one for each machine, as well as $m$ garbage collections. 
Jobs not completed by their deadline get thrown to the corresponding garbage 
collection. Their processing can continue later when the machine is idle. As before, 
we assume we get no benefit from these jobs. The multiprocessor Algorithm 
ALG2 is as follows: 

— A new job $l$ arrives. In case there is a machine such that $d_l(t) > 4d_k$, where 
$k$ is the job at the top of its stack, or its stack is empty, push job $l$ to that 
stack and run it. Otherwise, just add job $l$ to the pool. 

— A job at the top of a stack is completed or reaches its breakpoint. Then, 
pop jobs from the top of that stack as long as their breakpoints have been 
reached. Unless the stack is empty, let $k$ be the index of the new job at the 
top of the stack. Continue running job $k$ only if $d_j(t) \le 4d_k$ for all $j$ in the 
pool. In any other case, get the job from the pool with maximum $d_j(t)$, push 
it into that stack and run it. 
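
The two event rules of ALG2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `on_arrival`, `on_top_finished`, `prio`, `d_fixed`, and `expired` are hypothetical, the priority function is supplied by the caller, and the sketch assumes that the priority $d_k$ of a stacked job is frozen at the moment it enters a stack.

```python
def on_arrival(stacks, pool, job, now, prio, d_fixed):
    """Arrival rule. Returns the machine index that starts `job`, or None if pooled.

    stacks  : list of lists, one per machine (top of stack = last element)
    pool    : set of waiting jobs
    prio    : callable (job, time) -> current priority d_job(time)
    d_fixed : dict job -> priority frozen at stack entry (an assumption of this sketch)
    """
    d_now = prio(job, now)
    for i, stack in enumerate(stacks):
        if not stack or d_now > 4 * d_fixed[stack[-1]]:
            d_fixed[job] = d_now
            stack.append(job)
            return i
    pool.add(job)
    return None


def on_top_finished(stack, pool, now, prio, d_fixed, expired):
    """Completion/breakpoint rule for one machine. Returns the job to run, or None.

    expired : callable (job, time) -> True if the job completed or hit its breakpoint
    """
    while stack and expired(stack[-1], now):
        stack.pop()
    best = max(pool, key=lambda j: prio(j, now), default=None)
    if stack and (best is None or prio(best, now) <= 4 * d_fixed[stack[-1]]):
        return stack[-1]          # keep running the new top job k
    if best is None:
        return None               # nothing to run: the machine idles
    pool.remove(best)
    d_fixed[best] = prio(best, now)
    stack.append(best)
    return best


# Toy run with constant (hypothetical) priorities on two machines.
d = {"a": 1, "b": 2, "c": 10, "e": 3}
prio = lambda job, now: d[job]
stacks, pool, d_fixed = [[], []], set(), {}
started = [on_arrival(stacks, pool, j, 0, prio, d_fixed) for j in "abce"]
resumed = on_top_finished(stacks[0], pool, 1, prio, d_fixed, lambda j, t: j == "c")
```

In the toy run job "c" preempts "a" on machine 0, job "e" waits in the pool, and after "c" finishes the machine resumes "a" because no pooled job has more than 4 times its priority.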

We define $J_1$ and $J_2$ in exactly the same way as in the uniprocessor case. 

Lemma 4. For the jobs in $J_1$, $\sum_{j \in J_1} w_j B_j(f_j^{OPT}) \le C \cdot V^{PSEUDO}$.

Proof. Since the proof of Lemma 1 used the definition of $J_1$ separately for each 
job, it remains true in the multiprocessor case as well. 

The following lemma extends Lemma 2 for the multiprocessor case: 

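Lemma 5. For the jobs in $J_2$, $\sum_{j \in J_2} w_j B_j(f_j^{OPT}) \le 4C \cdot V^{PSEUDO}$.

Proof. For each $j \in J_2$ we define its 'optimal processing time' by machine $i$ as: 
$T_{j,i} = \{t \mid \text{job } j \text{ is processed by OPT on machine } i \text{ at time } t\}$. Then

\[
\sum_{j \in J_2} w_j B_j(f_j^{OPT})
  = \sum_{j \in J_2} \sum_{1 \le i \le m} \int_{t \in T_{j,i}} B_j(f_j^{OPT})\,dt
  \le \sum_{j \in J_2} \sum_{1 \le i \le m} \int_{t \in T_{j,i}} B_j(t - r_j)\,dt
  \le C \sum_{j \in J_2} \sum_{1 \le i \le m} \int_{t \in T_{j,i}} d_j(t)\,dt .
\]

According to the definition of $J_2$, during the processing of job $j \in J_2$ by the 
optimal algorithm, the on-line algorithm still keeps the job in its pool. By 
Observation 23 we know that the job's priority is not too high; it is at most 4 times 
the priority of the currently running jobs and specifically, at time $t$ for machine 
$i$ such that $t \in T_{j,i}$ its priority is at most 4 times the priority of the job at the 
top of stack $i$ in the on-line algorithm. Denote that job by $j(t,i)$. So, 

\[
C \sum_{j \in J_2} \sum_{1 \le i \le m} \int_{t \in T_{j,i}} d_j(t)\,dt
  \le 4C \sum_{j \in J_2} \sum_{1 \le i \le m} \int_{t \in T_{j,i}} d_{j(t,i)}\,dt
  \le 4C \sum_{1 \le i \le m} \int_{t \in \cup_j T_{j,i}} d_{j(t,i)}\,dt
  \le 4C \sum_{1 \le i \le m} \int d_{j(t,i)}\,dt
  \le 4C \sum_{j} w_j d_j = 4C \cdot V^{PSEUDO}.
\]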

Lemma 6. $V^{PSEUDO} \le 2C \cdot V^{ON}$.

Proof. By using Lemma 3 separately on each machine we get the same result 
for the multiprocessor case. 

Combining all the results: 

Theorem 2. Algorithm ALG2 for the multiprocessor case is $10C^2$-competitive. 
Abstract. Certain tasks, like accessing pages on the World Wide Web, 
require a duration that varies over time. This poses the following Variable 
Length Sequencing Problem, or VLSP-$L$. Let $[i,j)$ denote $\{i, i+1, \ldots, j-1\}$. 
The problem is given by a set of jobs $\mathcal{J}$ and the time-dependent 
length function $\lambda : \mathcal{J} \times [0,n) \to L$. A sequencing function $\sigma : \mathcal{J} \to [0,n)$ 
assigns to each job $j$ a time interval $T_\sigma(j)$ when this job is executed; if 
$\sigma(j) = t$ then $T_\sigma(j) = [t, t + \lambda(j,t))$. The sequencing is valid if these time 
intervals are disjoint. Our objective is to minimize the makespan, i.e. 
the maximum ending of an assigned time interval. Recently it was shown that 
VLSP-$[0,n)$ is NP-hard and that VLSP-$\{1,2\}$ can be solved efficiently. 
For a more general case of VLSP-$\{1,k\}$ a $2 - 1/k$ approximation was 
shown. This paper shows that for $k \ge 3$ VLSP-$\{1,k\}$ is MAX-SNP hard, 
and that we can approximate it with ratio $2 - 4/(k+3)$. 



1 Introduction 

The VLSP problem described above was introduced by Czumaj et al. [5]. A 
motivation presented there is that collecting information from a web site takes an 
amount of time that depends on the congestion in the network in a specific geographic 
area. Suppose that we want to collect news stories and pictures on some topics 
from sites in different countries. The congested times in Australia, Spain, 
Austria etc. would fall into different times of the day. To motivate VLSP-$\{1,k\}$, we 
can make a simplifying assumption that a task/job takes one or $k$ units of time, 
respectively in the absence or presence of the congestion. 

The first question that one could ask is if all jobs can be scheduled in a 
specified time-frame. Since for $k > 2$ this question is NP-complete, we must 
settle for an approximation algorithm. One function that could be maximized is 
the throughput; we would fix the time frame and maximize the number of jobs 
that can be executed. This problem is well studied (cf. [6, 2, 1, 3]). Presumably, 
the unscheduled jobs would be scheduled on other processors. 

A more natural assumption is that we can sequence all the given jobs, but 
we want to complete them as early as possible. The earlier we complete these 
tasks, the earlier we can use the resulting synthesis of the information, and this 
may translate into a competitive advantage. This in turn means that we are 
minimizing the makespan, as defined in the abstract. 

In this paper we translate this into a graph problem related to finding 
a maximum matching (similarly to Czumaj et al. [5]). Then we present 
two approximation algorithms for the graph problem; our performance ratio is 
obtained by running those two algorithms and choosing the better of the two 
solutions. 

In this extended abstract we ignore the issues of the most efficient implemen- 
tation of our algorithms. While they are clearly polynomial, we hope to obtain 
better running time than implied by the brute force approach. 



2 Graph Problem 

Because we focus entirely on the version of VLSP where a job length can be 
either 1 or $k$, we can present the input as a bipartite graph: $\mathcal{J} \cup [0,n)$ are the 
nodes and $\mathcal{E} = \{(j,t) : \lambda(j,t) = 1\}$ are the edges. In turn, a sequencing can be 
represented by a set of time slots $A \subseteq [0,n)$ that can be matched with jobs in 
this graph. We will view a respective matching as a function $M : A \to \mathcal{J}$. 

From such a set $A \subseteq [0,n)$, we compute the sequencing by selecting 
$|\mathcal{J}|$ time intervals, where the eligible intervals have the form $[a, a+k)$ (for 
arbitrary $a$) or $\{a\}$ (for $a \in A$). We always select an interval that does not overlap 
our previous selection, and under that proviso, that has the earliest ending. The 
pseudocode below describes this algorithm in detail. 

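Compute a matching $M$ that is an injection of $A$ into $\mathcal{J}$; 
$t \leftarrow 0$, $B \leftarrow \emptyset$, $\mathcal{K} \leftarrow \mathcal{J}$; 

while ( $|B| < |\mathcal{K}|$ ) 
{   $C \leftarrow [t, t+k) \cap A$; 
    if ( $C \ne \emptyset$ ) 
        $t \leftarrow \min(C)$, 
        $\sigma(M(t)) \leftarrow t$, 
        $\mathcal{K} \leftarrow \mathcal{K} - \{M(t)\}$, 
        $t \leftarrow t + 1$; 
    else 
        $B \leftarrow B \cup \{t\}$, 
        $t \leftarrow t + k$; 
} 

while ( $\mathcal{K} \ne \emptyset$ ) 
    $j \leftarrow$ delete_any($\mathcal{K}$), 
    $\sigma(j) \leftarrow$ delete_min($B$); 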
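
As a rough illustration, the sequencing procedure can be written as the following Python sketch. The function name `sequence_from_slots` and its argument layout are hypothetical, the initial values (time 0, no reserved long intervals, all jobs unsequenced) are assumed, and "delete_any" is made deterministic by taking the remaining jobs in sorted order.

```python
def sequence_from_slots(jobs, A, M, k):
    """Compute a sequencing from a legal slot set A.

    jobs : iterable of job identifiers
    A    : set of time slots in which the matched job runs with length 1
    M    : dict slot -> job, an injection of A into the jobs
    k    : length of a job that does not run in a matched slot
    Returns (sigma, makespan) where sigma maps each job to its start time.
    """
    A = set(A)
    remaining = set(jobs)   # jobs that are not sequenced yet
    B = []                  # start times of the reserved length-k intervals
    sigma = {}
    t = 0
    while len(B) < len(remaining):
        C = [a for a in range(t, t + k) if a in A]
        if C:
            t = C[0]                 # earliest matched slot in [t, t + k)
            sigma[M[t]] = t
            remaining.discard(M[t])
            t += 1
        else:
            B.append(t)              # reserve [t, t + k) for some unmatched job
            t += k
    # Hand the reserved intervals, earliest first, to the remaining jobs.
    for job, start in zip(sorted(remaining), B):
        sigma[job] = start
    return sigma, t


# Hypothetical instance: three jobs, k = 3, unit-length slots 1 and 5.
sigma, makespan = sequence_from_slots(jobs=[0, 1, 2], A={1, 5}, M={1: 0, 5: 1}, k=3)
```

In this instance job 0 runs in slot 1, job 2 occupies the long interval [2, 5), and job 1 runs in slot 5, so the final value of the time variable is 6.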



We will use $t^A$ to denote the final value of $t$ in the above algorithm, i.e. the 
makespan of the resulting sequencing. Now we can rephrase our objective as 
follows: 

$A$ is legal if there exists a matching with domain $A$; find a legal $A$ that 
minimizes $t^A$. 

In the next section we show that this problem is MAX-SNP hard for $k > 2$. 
Thus we must settle for a less ambitious goal, namely to minimize ratio $r > 1$ 
such that our polynomial time algorithm finds $A$ that satisfies $t^A / t^* \le r$, where 
$t^*$ denotes the minimal makespan. 



3 MAX-SNP-Completeness 

We will show that VLSP-$\{1,3\}$ is MAX-SNP hard, which proves that we cannot 
approximate it with a ratio better than some constant $r > 1$, unless P = NP. 



Theorem 1. VLSP-{1,3} is MAX-SNP hard. 

Proof. We will show a reduction of 3-MIS to VLSP-{1,3}. 3-MIS is a problem 
of finding a maximum independent set in a graph where each node has three 
neighbors. A proof of MAX-SNP completeness of 3-MIS can be found in [4]. 

Consider an instance of 3-MIS. We may assume that it is a graph with nodes 
$[0,2n)$ and a set of edges $E = \{e_i : i \in [0,3n)\}$. For each node we number the 
incident edges from 0 to 2, so the $k$-th edge incident to $v$ is $e_{inc(v,k)}$. 

We define the corresponding instance of VLSP-$\{1,3\}$ as follows. The set of 
jobs will be $\mathcal{J} = [0,6n)$. As we observed in the previous section, it suffices to 
define $\mathcal{E}$, the set of pairs $(j,t)$ such that $\lambda(j,t) = 1$. For each node $v \in [0,2n)$, 
$v$ is a node job and set $\mathcal{E}$ contains $(v, 4v+3)$. For each edge $e_j \in E$, $2n+j$ is 
an edge job and set $\mathcal{E}$ contains two pairs; in particular, if $inc(v,k) = j$ then $\mathcal{E}$ 
contains $(2n+j, 4v+k)$. For jobs in $[5n,6n)$ set $\mathcal{E}$ contains no pairs. 
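
The construction can be checked mechanically with a small Python sketch. The function name `reduce_3mis_to_vlsp` and the edge-list input format are hypothetical, and the numbering of incident edges simply follows the order in which edges are listed, which is one admissible choice of $inc(v,k)$.

```python
def reduce_3mis_to_vlsp(n, edges):
    """Build the set of pairs (job, slot) with length 1 for the VLSP-{1,3} instance.

    n     : the cubic graph has nodes 0 .. 2n-1
    edges : list of 3n pairs (v, w); edge number j is edges[j]
    Jobs 0 .. 2n-1 are node jobs, 2n .. 5n-1 are edge jobs,
    and jobs 5n .. 6n-1 get no unit-length slot at all.
    """
    assert len(edges) == 3 * n
    incident = {v: [] for v in range(2 * n)}
    for j, (v, w) in enumerate(edges):
        incident[v].append(j)
        incident[w].append(j)
    assert all(len(inc) == 3 for inc in incident.values()), "graph must be cubic"

    pairs = set()
    for v in range(2 * n):
        pairs.add((v, 4 * v + 3))                 # node job v
        for k, j in enumerate(incident[v]):       # inc(v, k) = j
            pairs.add((2 * n + j, 4 * v + k))     # edge job 2n + j
    return pairs


# Hypothetical input: the complete graph on 4 nodes (n = 2), which is cubic.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
pairs = reduce_3mis_to_vlsp(2, k4)
```

For this input every time slot in $[0,8n)$ occurs in exactly one pair, and each edge job has exactly two unit-length slots, one per endpoint.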

We first observe that we have $5n$ jobs that potentially can be sequenced with 
a time interval of length 1 (node and edge jobs) and a set of $n$ jobs, $[5n,6n)$, that 
must be sequenced with a time interval of length 3. Therefore the makespan is 
at least $8n$. 

Before we proceed, we will prove the following lemma which will be also 
useful later. 

Lemma 1. Consider an instance of VLSP-$\{1,k\}$ with job set $\mathcal{J}$ and edge set 
$\mathcal{E}$. Assume that $\mathcal{E} \subseteq \mathcal{J} \times [0,t)$ and that $t^* \ge t$. Then for some $A^*$ that is a 
domain of a maximum matching we have $t^* = t^{A^*}$. 

Proof. For some legal $A \subseteq [0,t)$ we have $t^* = t^A$. If $A$ is a domain of a matching 
$M$, we can augment $M$ to a maximum matching $M^*$ with domain $A^* \supseteq A$. 
Observe that the algorithm that computes $t^{A^*}$ assures that $t^{A^*} \le t^A$, thus $t^{A^*} = 
t^*$. □ 

In our instance of VLSP-$\{1,3\}$ a maximum matching matches all the node 
and edge jobs, because each $i \in [0,8n)$ belongs to exactly one edge of $\mathcal{E}$. Consider 
a maximum legal $A \subseteq [0,8n)$. Because node job $2n-1$ can be matched only by 
$8n-1$, the algorithm that computes the sequencing and $t^A$ has a state when 
$t = 8n$. Let $B_0$ be $B$ at that moment; obviously, at the same moment we have 
$\mathcal{K} = [5n,6n)$. Thus we need to perform $n - |B_0|$ steps that increase $|B|$ from $|B_0|$ 
to $n$ and $t$ from $8n$ to $8n + 3(n - |B_0|) = 11n - 3|B_0|$. 

Because maximum legal $A$ contains a match of job 0, it contains 3. Therefore 
$B_0$ cannot contain 1 (as $[1,1+3) \cap A \ne \emptyset$), 2 or 3. By extending this reasoning 
to other nodes, we can see that $B_0 \subseteq \{4v : v \in [0,2n)\}$. We can define a set 
of nodes $J = \{v : 4v \in B_0\}$. We can show that $J$ is independent. Suppose 
that $v$ and $w$ are neighbors, then for some edge $e_j$ and $k,l \in [0,3)$ we have 
$j = inc(v,k) = inc(w,l)$. Consequently, for the edge job $2n+j$ set $A$ must 
contain either $4v+k$ or $4w+l$, thus one of $4v$ and $4w$ is not in $B_0$ and one of $v$ 
and $w$ is not in $J$. 

Summarizing, if we have legal $A$ that yields $t^A \le 11n - 3i$, we can extend it 
to a maximum legal $A$ and find the respective $B_0$ and $J$ of size at least $i$. This 
is the solution transformation. 

One can also easily show that if $J \subseteq [0,2n)$ is a maximal independent set, 
then we can find a legal $A$ such that for $B_0$ defined as above we have $B_0 = 
\{4v : v \in J\}$ and thus $t^A = 11n - 3|J|$. Thus if we compute a solution for our 
VLSP-$\{1,3\}$ instance with some error, we obtain the same absolute error size 
for the original 3-MIS instance. Because $|J| \ge n/2$, the relative error obtained 
for the 3-MIS instance is at most $(11n - 3n/2)/(n/2) = 19$ times larger. This 
shows that our construction is an approximation preserving reducibility, and 
thus VLSP-$\{1,3\}$ is MAX-SNP hard. □ 

4 An Algorithm with $[(2k+1)t^* - kn]/(k+1)$ Bound 

Our algorithm for VLSP-$\{1,k\}$ has the following form. We will run two 
algorithms, each with a different performance guarantee. Then we take the better of 
the two results. 

In the analysis, we will assume that $|\mathcal{J}| = n$. 

The first algorithm is the following. We try to guess $t^*$, so we use a for 
loop with $t$ ranging through all possible values of $t^*$, i.e. from $n$ to $kn$. For a 
given $t$, we can find the maximum size $m$ of a matching with domain in $[0,t)$. If 
$m + k(n-m) > t$, we know that $t < t^*$. Otherwise, we can hope that $t = t^*$. If 
it is so, then by Lemma 1, we know that a legal set $A^* \subseteq [0,t)$ with $m$ elements 
provides the optimum solution, i.e. $t^{A^*} = t$. In particular, when we run our 
algorithm that computes the sequencing from $A^*$, then when its time variable reaches 
the guessed value $t$ we will have $B$ equal to some $B^*$ that has $n - m$ elements. 
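
The guessing loop and its lower-bound test can be sketched in Python. This is an illustration only: the names `max_matching_size` and `first_plausible_guess` are hypothetical, and the matching is computed with a plain augmenting-path routine (Kuhn's algorithm) chosen for brevity, not for speed.

```python
def max_matching_size(pairs, t):
    """Size of a maximum matching between jobs and the time slots in [0, t).

    pairs : set of (job, slot) such that the job has length 1 in that slot
    """
    slots_of = {}
    for job, slot in pairs:
        if slot < t:
            slots_of.setdefault(job, []).append(slot)
    owner = {}  # slot -> job currently matched to it

    def try_assign(job, seen):
        # Standard augmenting-path step: find a free slot, or free one up.
        for slot in slots_of.get(job, []):
            if slot in seen:
                continue
            seen.add(slot)
            if slot not in owner or try_assign(owner[slot], seen):
                owner[slot] = job
                return True
        return False

    return sum(try_assign(job, set()) for job in slots_of)


def first_plausible_guess(pairs, n, k):
    """Smallest t in [n, kn] that passes the test m + k(n - m) <= t, with its m."""
    for t in range(n, k * n + 1):
        m = max_matching_size(pairs, t)
        if m + k * (n - m) <= t:
            return t, m
    return None


# Hypothetical instance: 3 jobs, k = 3; jobs 0 and 2 compete for slot 1, job 1 has slot 5.
guess = first_plausible_guess({(0, 1), (2, 1), (1, 5)}, n=3, k=3)
```

Every value of $t$ rejected by the test is a certified lower bound on $t^*$; the first accepted value is only a candidate, which the rest of the algorithm then tries to realize.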

Suppose now that we guessed correctly $B^*$, but we do not know $A^*$. Then 
we can find $A^*$ as the maximum legal set that is contained in $free(B^*)$, where 

\[ occupied(B) = \bigcup_{j \in B} [j, j+k), \qquad free(B) = [0,t) - occupied(B). \]

Thus finding $A^*$ from $t$ and $B^*$ is a maximum matching problem. 

This allows us to define what a good set $B$ is: $|occupied(B)| = k|B|$ and 
$free(B)$ contains a maximum legal subset of $[0,t)$. As we have seen in the previous 
section, finding a good $B$ of the maximum size is MAX-SNP hard. However, we 
can find an approximate solution. 
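
The two set operators, and the first half of the goodness test, translate directly into Python. The function names below mirror the text, while `intervals_disjoint` is a hypothetical helper name for the condition $|occupied(B)| = k|B|$; the second half of the test (that $free(B)$ still contains a maximum legal set) needs a matching computation and is not shown.

```python
def occupied(B, k):
    """Union of the intervals [j, j + k) over all j in B."""
    return {slot for j in B for slot in range(j, j + k)}


def free(B, k, t):
    """Time slots of [0, t) that are not covered by any interval of B."""
    return set(range(t)) - occupied(B, k)


def intervals_disjoint(B, k):
    """True exactly when |occupied(B)| = k * |B|, i.e. the intervals do not overlap."""
    return len(occupied(B, k)) == k * len(B)
```

For example, with $k = 3$ and $t = 6$ the set $B = \{2\}$ occupies slots 2, 3, 4 and leaves 0, 1, 5 free.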

One can try various methods, and there seems to be a trade-off between the 
approximation ratio and the running time. In this preliminary version, we will 
analyze just one algorithm for approximating the maximum good B: 

As long as possible, enlarge $B$ by inserting a new element, or by 
removing one element and inserting two. 

Summarizing, the algorithm of this section runs t through all possible values 
of t*, for each finds the size of a maximum legal set contained in [0, t) and then 
searches for a large good set B using the above method. 
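
A minimal Python sketch of this local search is given below. It is written against a caller-supplied predicate `is_good`, which is an assumption of the sketch: in the algorithm proper the predicate would also verify, via a matching computation, that the free slots still contain a maximum legal set. The names `local_search` and `start` are hypothetical.

```python
from itertools import combinations


def local_search(candidates, is_good, start=()):
    """Grow a good set by single insertions or by 1-out / 2-in exchanges."""
    B = set(start)
    while True:
        outside = sorted(set(candidates) - B)
        # Move 1: insert one new element.
        step = next((B | {x} for x in outside if is_good(B | {x})), None)
        if step is None:
            # Move 2: remove one element and insert two.
            step = next(
                ((B - {out}) | {x, y}
                 for out in sorted(B)
                 for x, y in combinations(outside, 2)
                 if is_good((B - {out}) | {x, y})),
                None,
            )
        if step is None:
            return B          # no move applies: B is locally maximal
        B = step              # |B| grew by one, so the loop terminates


# Toy predicate (hypothetical): only require that the length-3 intervals are disjoint.
def disjoint3(S):
    return len({s for j in S for s in range(j, j + 3)}) == 3 * len(S)


result = local_search({0, 2, 4}, disjoint3, start={2})
```

Starting from the set {2}, no single insertion is possible, but exchanging 2 for the pair 0 and 4 is, which is exactly the second kind of move.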

Lemma 2. If our method finds a good set $B$, and $B^*$ is the maximum good set, 
then $|B^*|/|B| \le (k+1)/2$. 

Proof. Let $M$ and $M^*$ be maximum matchings with domains contained in 
$free(B)$ and $free(B^*)$ respectively. Because we have verified that $B$ is good, we 
know $M$. Among various possible matchings $M^*$, for the analysis we pick one 
that minimizes $|M - M^*|$. 

Consider a connected component $C$ of $M \cup M^*$. Because every node (time 
moment or a job) belongs to at most one edge in $M$ and at most one edge in 
$M^*$, $C$ is a simple path or a simple cycle. Suppose $C$ is a cycle, then we can 
change $M^*$ into $M^* \oplus C$ and $|M - M^*|$ decreases ($\oplus$ is a symmetric difference), a 
contradiction. Suppose that $C$ is a path with an odd number of edges, then it is 
an augmenting path that increases $M$ or $M^*$, a contradiction because these are 
maximum matchings. Suppose that $C$ is a simple path with endpoints in $\mathcal{J}$, then 
we can change $M^*$ into $M^* \oplus C$ without changing its domain; such modified $M^*$ 
is still consistent with $B^*$, but $|M - M^*|$ is smaller, a contradiction. We conclude 
that $C$ is a simple path with both endpoints in $[0,t^*)$. 

A conflict between $B$ and $B^*$ is a pair $(a, a^*)$ such that $a \in occupied(B)$, $a^* \in 
occupied(B^*)$ and either $a = a^*$ or these two time moments are two endpoints 
of a connected component of $M \cup M^*$. If $a \in [j, j+k)$ for some $j \in B$ and 
$a^* \in [j^*, j^*+k)$ for some $j^* \in B^*$, we also say that this is a conflict between $j$ 
and $j^*$. 

It is easy to see that conflicts are disjoint, i.e. if $(a, a^*) \ne (\bar{a}, \bar{a}^*)$, then $a \ne \bar{a}$ 
and $a^* \ne \bar{a}^*$. Consider $j^* \in B^*$. If $j^*$ has no conflict with any $j \in B$, then we 
would enlarge $B$ by inserting $j^*$. If $j^*$ has a conflict with only one element of $B$, 
then we say that this conflict is acute. 

We give 2 debits to each $j^* \in B^*$ and $k+1$ credits to each $j \in B$. To prove 
the lemma, it suffices to show that there are at least as many credits as there are 
debits. 

For every conflict between $j \in B$ and $j^* \in B^*$ we remove one debit from $j^*$ and 
one credit from $j$. Because each $j \in B$ started with $k+1$ credits, and $[j, j+k)$ 
contains $k$ time moments, such a $j$ must still possess at least one credit. In turn, 
$j^*$ still possesses a debit only if it has an acute conflict with some $j \in B$. Now it 
suffices to show that there can be no two acute conflicts with the same element 
of $B$. 

Suppose that $j_0^*$ and $j_1^*$ have acute conflicts with the same $j \in B$. Then we 
would enlarge $B$ by removing $j$ and inserting $j_0^*$ and $j_1^*$, a contradiction, because 
this step would be performed before we have terminated the computation of $B$. 

□ 

Now we can calculate the performance guarantee of this algorithm. Suppose 
that $m = |B^*|$. Because when we reach $t = t^*$ we have $|B| \ge 2m/(k+1)$, we will 
obtain a makespan not larger than $t^* + (1 - 2/(k+1))km = t^* + km(k-1)/(k+1)$. 
In turn, we know that $n - m + km \le t^*$, thus $(k-1)m \le (t^* - n)$ and thus we 
can bound the makespan with $t^* + (t^* - n)k/(k+1) = [(2k+1)t^* - kn]/(k+1)$. 

5 An Algorithm with (t* + kn)/2 Bound 

Let us say that a legal set $A$ is succinct if for every proper subset $A' \subset A$ we 
have $t^{A'} > t^A$. The algorithm described in this section will be constructing a 
succinct legal set. 

The trick of our method is the following: suppose that $A = \{a_1, a_2, \ldots, a_{2l}\}$ 
where the sequence of $a_i$'s is increasing. Let $\bar{A} = (a_2, a_4, \ldots, a_{2l})$. If $A$ is a 
succinct legal set, then we say that $\bar{A}$ is a legal prototype. Before we proceed, we 
need to show that for a given $\bar{A}$ we can check if it is a legal prototype, and if 
indeed it is, we can find the corresponding succinct legal set. 

Lemma 3. For a given increasing sequence $\bar{A} = (a_2, a_4, \ldots, a_{2l})$ we define 

$a_0 = 0$, 

$S_i = \{a_i\}$ if $i$ is even and $S_i = [a_{i-1}+1, a_{i+1})$ if $i$ is odd, 

$r_i = |S_i| \bmod k$, 
$d_i = (|S_i| - r_i)/k$, 

$T_i = \{a_{i-1} + dk + r \mid 0 \le d \le d_i \text{ and } 1 \le r \le r_i\}$, 

$n_i = n - (d_1 + 1) - \ldots - (d_i + 1)$. 

Moreover, $\mathcal{T} = \{T_1, \ldots, T_{2l}\}$ and $G(\bar{A})$ is a bipartite graph with node set 
$\mathcal{T} \cup \mathcal{J}$ and edge set $\{\{T,j\} : \lambda(j,a) = 1 \text{ for some } a \in T\}$. Then $\bar{A}$ is a legal 
prototype if and only if 

1. $r_i \ne 0$ for $i = 1, \ldots, 2l$, 

2. $G(\bar{A})$ has a matching with domain $\mathcal{T}$, and 

3. $n_{2l} \ge 0$. 

Proof. To see that all three conditions are necessary, consider a succinct legal 
set $A = \{a_1, \ldots, a_{2l}\}$. Clearly, every $a_i \in S_i$. Define $r \in [0,k)$ and $d$ by equality 
$a_i = a_{i-1} + dk + r$. If the sequencing computed from $A$ executes $d_i$ or fewer jobs 
within $S_i$, then we can assign to each of these jobs a time interval of size $k$, so 
we can remove $a_i$ from $A$ and $t^A$ remains unchanged; a contradiction, because $A$ 
is succinct. Therefore we can run $d_i + 1$ jobs within $S_i$; because only one of them 
uses a time interval of size 1, namely $\{a_i\}$, the remaining jobs have intervals of 
length $k$. Suppose $d$ of these intervals belong to $[a_{i-1}+1, a_i)$ and $d_i + 1 - d$ 
belong to $[a_i + 1, a_{i+1})$. Define $r$ by equality $a_i = a_{i-1} + dk + r$. Because we can 
fit $d$ time intervals of size $k$ in $[a_{i-1}+1, a_{i-1}+dk+r)$, $r \ne 0$. Similarly, because 
we can fit $d_i - d$ intervals of size $k$ in $[a_{i-1}+dk+r+1, a_{i-1}+d_i k+r_i+1)$, 
$r \le r_i$. Consequently, $a_i \in T_i$. Thus a matching with domain $A$ in the graph 
defining our VLSP-$\{1,3\}$ instance induces a matching with domain $\mathcal{T}$ in $G(\bar{A})$. 

To see that the third condition is satisfied as well, observe that we run $d_i + 1$ 
jobs within each $S_i$. If $n_{2l}$ were negative, that would mean that we have run 
more than $n$ jobs, a contradiction. In other words, we have run at least $n$ jobs 
within $[1, a_{2l})$, so we can remove $a_{2l}$ from $A$, a contradiction because $A$ is succinct. 

Now we can show that the conjunction of our three conditions is sufficient: 
Given a matching with domain $\mathcal{T}$ we obtain a matching with domain $A$ such 
that $|A \cap T_i| = 1$ for $i = 1, \ldots, 2l$. Our above calculations show that because 
the first and the last conditions are satisfied, we can run $d_i + 1$ jobs within each $S_i$. 
Because each $S_i$ has fewer than $(d_i + 1)k$ elements, removing $a_i$ from $A$ increases 
$t^A$, so $A$ is a succinct legal set, and thus $\bar{A}$ is a legal prototype. □ 

We compute a legal prototype in the following greedy manner: 

$l \leftarrow 0$ 

for ( $t \leftarrow 0$; $t < kn$; $t \leftarrow t + 1$ ) 

    if ( $\bar{A} \cup \{t\}$ is a legal prototype ) 
        $l \leftarrow l + 1$ 
        $a_{2l} \leftarrow t$ 
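
The greedy scan itself is short; a Python sketch follows. The predicate `is_legal_prototype` is deliberately left to the caller and is an assumption of this sketch: a real implementation would evaluate the three conditions of Lemma 3, including the matching test. The toy predicate in the usage lines is purely hypothetical and only exercises the control flow.

```python
def greedy_prototype(n, k, is_legal_prototype):
    """Scan t = 0 .. kn-1 and append t whenever the extended sequence stays legal.

    is_legal_prototype : callable taking an increasing list of time slots
                         and returning True or False
    """
    prototype = []                     # plays the role of (a_2, a_4, ..., a_2l)
    for t in range(k * n):
        if is_legal_prototype(prototype + [t]):
            prototype.append(t)        # l <- l + 1, a_2l <- t
    return prototype


# Stand-in predicate for demonstration only: at most two elements, at least 3 apart.
def toy_predicate(seq):
    return len(seq) <= 2 and all(b - a >= 3 for a, b in zip(seq, seq[1:]))


chosen = greedy_prototype(n=2, k=3, is_legal_prototype=toy_predicate)
```

Because the scan visits time slots in increasing order, each accepted slot is the least possible extension of the current prototype, which is the property the inductive analysis uses.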

Our lemma shows that we can find a legal set that corresponds to this 
prototype. However, if $n_{2l} > 0$, we must run $n_{2l}$ jobs after time moment $a_{2l}$ and 
it is possible that we can decrease $t^A$ by inserting yet another element to $A$. 
To check if indeed it is the case, we define $d_{2l+1} = n_{2l} - 1$ and try to find the 
smallest possible $r_{2l+1}$ such that after adding $T_{2l+1}$ to $\mathcal{T}$ (with the same 
definition as for other $T_i$'s), we still can find a matching with domain $\mathcal{T}$. If we find such 
an $r_{2l+1}$, then a matching with domain $\mathcal{T}$ corresponds to a legal set $A$ such that 
$t^A = a_{2l} + (n_{2l} - 1)k + r_{2l+1}$; if no such $r_{2l+1}$ exists, we get $t^A = a_{2l} + n_{2l}k$. 

We finish the analysis using an inductive argument. We define $A(i)$ to be a 
succinct legal set that for $j = 1, \ldots, i$ satisfies $|A(i) \cap S_j| = |A(i) \cap T_j| = 1$ and 
under that limitation, $t^{A(i)}$ is minimal; we also use $t_i$ to denote $t^{A(i)}$. Our goal 
is to show that our algorithm finds $A$ such that $t^A \le (t^* + kn)/2$. With our 
notation, we can rephrase it as 

\[ t^A - a_0 \le (t_0 - a_0 + kn_0)/2. \]

Our inductive claim is that we find $A$ such that 

\[ t^A - a_i \le (t_i - a_i + kn_i)/2, \]

and we show it for $i = 2l, 2l-2, \ldots, 2, 0$. 

For $i = 2l$ our algorithm finds $A$ such that $t^A = t_{2l}$, so the claim follows from 
the fact that $t_{2l} - a_{2l} \le kn_{2l}$. 

Assume now that we have proven our claim for $i = 2h$. To show that it holds 
also for $i = 2h-2$, consider $A(2h-2)$. Our case analysis depends on the size of 
$\Delta = A(2h-2) \cap (T_{2h-1} \cup T_{2h})$. 

If $|\Delta| = 2$, say $\Delta = \{a'_{2h-1}, a'_{2h}\}$, then $\{a_2, \ldots, a_{2h-2}\} \cup \{a'_{2h}\}$ is a legal 
prototype, and $a'_{2h} \le a_{2h}$. Because the greedy algorithm chooses the least 
possible extension of the current legal prototype, we have $a_{2h} = a'_{2h}$, thus 
$t_{2h-2} = t_{2h}$. By inductive assumption, 

\[
\begin{aligned}
t^A - a_{2h-2} &\le (a_{2h} - a_{2h-2}) + (t_{2h} - a_{2h} + kn_{2h})/2 \\
 &= (t_{2h} - a_{2h-2} + kn_{2h-2})/2 + (a_{2h} - a_{2h-2})/2 - k(n_{2h-2} - n_{2h})/2 \\
 &= (t_{2h-2} - a_{2h-2} + kn_{2h-2})/2 + (kd_{2h-1} + r_{2h-1} + 1)/2 - k(d_{2h-1} + 2)/2 \\
 &= (t_{2h-2} - a_{2h-2} + kn_{2h-2})/2 + (r_{2h-1} + 1 - 2k)/2 \\
 &< (t_{2h-2} - a_{2h-2} + kn_{2h-2})/2 .
\end{aligned}
\]

Observe that if $|\Delta| > 2$, where the two smallest elements of $\Delta$ are $a'_{2h-1}$ and 
$a'_{2h}$, then we get $a'_{2h} < a_{2h}$, which is not possible with our greedy choice of $a_{2h}$. 
Our second observation is that in the last inequality we used the fact that 
$r_{2h-1} + 1 < 2k$. The latter inequality is still valid if we increase the left hand 
side by $k$. Thus the above chain of inequalities can be used even if we increase 
the estimate of $t_{2h}$ from $t_{2h-2}$ to $t_{2h-2} + k$. 

If $|\Delta| = 0$, then the sequencing that produces the makespan $t_{2h-2}$ follows 
a job executed during time slot $a_{2h-2}$ with $d_{2h-1} + 1$ jobs that are executed in 
$k$ time slots each. Our choice of $a_{2h}$ enables us to execute only $d_{2h-1}$ such jobs 
before $a_{2h} + 1$, so we can obtain a sequencing that is consistent with the selection 
of $a_{2h}$ by executing one of these jobs at the very end. Clearly, this increases the 
makespan by exactly $k$, hence $t_{2h} \le t_{2h-2} + k$. 

If $|\Delta| = 1$, then we can denote the sole element of $\Delta$ with $a'_{2h-1}$. We consider 
two subcases. If $a'_{2h-1} \in T_{2h-1}$, then the job scheduled in this time slot still can 
be scheduled after we have fixed $T_{2h-1}$ and $T_{2h}$. Thus the only possible mistake 
is that the schedule obtained from $A(2h-2)$ starts $d_{2h-1} + 1$ jobs of length $k$ 
during $S_{2h-1} \cup S_{2h}$ while in a schedule obtained from $A(2h)$ we start only $d_{2h-1}$, 
so we may be forced to execute one of these jobs at the very end. Thus it remains 
to consider the case of $a'_{2h-1} \notin T_{2h-1}$. It means that for some $0 \le d \le d_{2h-1}$ 
and $r_{2h-1} < r < k$ we have $a'_{2h-1} = a_{2h-2} + dk + r$. One can see that in this 
case the sequencing that is generated by $A$ starts at most $d_{2h-1}$ jobs of length $k$ 
during $[a_{2h-1} + 1, a_{2h})$, so all of them can also be scheduled after we committed 
ourselves to $a_{2h}$. Thus as in the other subcase we need to reschedule to the very 
end only one job (the one that we would otherwise schedule in time slot $a'_{2h-1}$). 
This completes the analysis of the performance of our second algorithm. 



6 Summary 

Simple arithmetic shows that if we take the minimum of the performance 
guarantees for our two algorithms, and divide by $t^*$, then the worst case ratio occurs 
for $t^* = (k+3)kn/(3k+1)$ and it is equal to $2 - 4/(k+3)$. If we compare with 
the results of Czumaj et al., for $k = 3$ we decrease the approximation ratio 
from 5/3 to 4/3, and for $k = 4$ the decrease is from 7/4 to 10/7. 
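
The arithmetic can be verified exactly with rational numbers. The following Python sketch (the function name `worst_case_ratio` is hypothetical) evaluates both guarantees at the stated crossover point and confirms that they coincide there.

```python
from fractions import Fraction


def worst_case_ratio(k, n=1):
    """Ratio min(bound1, bound2) / t* at the crossover t* = (k+3)kn/(3k+1)."""
    t = Fraction((k + 3) * k * n, 3 * k + 1)
    bound1 = ((2 * k + 1) * t - k * n) / (k + 1)   # guarantee of the first algorithm
    bound2 = (t + k * n) / 2                       # guarantee of the second algorithm
    assert bound1 == bound2                        # the two guarantees cross here
    return min(bound1, bound2) / t


ratio3 = worst_case_ratio(3)   # Fraction(4, 3)
ratio4 = worst_case_ratio(4)   # Fraction(10, 7)
```

Below the crossover the first bound is the smaller one and above it the second is, so the ratio of the minimum to $t^*$ peaks exactly where the two bounds meet.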
Several open problems remain. One is to find if our first algorithm can run 
in time $O(|\mathcal{E}| \times |\mathcal{J}|)$. Another is to extend the results to VLSP-$\{1, 2, \ldots, k\}$. On 
the practical side, one should investigate if the real-life version of the problem, 
i.e. the data coming from the downloading time of the web pages from different 
geographic zones, has some general characteristics that make the problem easier, 
either easier to approximate, or even to compute in polynomial time. Intuitively, 
the periods of high and low congestion should have some dependencies, e.g. several 
different patterns rotated around a 24-hour cycle. 
Abstract. Motivated by the problem of WDM routing in all-optical 
networks, we study the following NP-hard problem. We are given a 
directed binary tree $T$ and a set $R$ of directed paths on $T$. We wish to 
assign colors to paths in $R$, in such a way that no two paths that share 
a directed arc of $T$ are assigned the same color and that the total 
number of colors used is minimized. Our results are expressed in terms of 
the depth of the tree and the maximum load $l$ of $R$, i.e., the maximum 
number of paths that go through a directed arc of $T$. 

So far, only deterministic greedy algorithms have been presented for the 
problem. The best known algorithm colors any set $R$ of maximum load 
$l$ using at most $5l/3$ colors. Alternatively, we say that this algorithm 
has performance ratio 5/3. It is also known that no deterministic greedy 
algorithm can achieve a performance ratio better than 5/3. 

In this paper we define the class of greedy algorithms that use 
randomization. We study their limitations and prove that, with high probability, 
randomized greedy algorithms cannot achieve a performance ratio better 
than 3/2 when applied to binary trees of depth and $1.293 - o(1)$ 
when applied to binary trees of constant depth. 

Exploiting inherent properties of randomized greedy algorithms, we 
obtain the first randomized algorithm for the problem that uses at most 
$7l/5 + o(l)$ colors for coloring any set of paths of maximum load $l$ on 
binary trees of depth with high probability. We also present an 
existential upper bound of $7l/5 + o(l)$ that holds on any binary tree. 

In the analysis of our bounds we use tail inequalities for random 
variables following hypergeometrical probability distributions, which may be 
of independent interest. 
1 Introduction 

Let $T(V,E)$ be a directed tree, i.e., a tree with each edge consisting of two 
opposite directed arcs. Let $R$ be a set of directed paths on $T$. The path coloring 
problem is to assign colors to paths in $R$ so that no two paths that share a 
directed arc of $T$ are assigned the same color and the total number of colors 
used is minimized. The problem has applications to WDM (Wavelength Di- 
vision Multiplexing) routing in tree-shaped all-optical networks. In such net- 
works, communication requests are considered as ordered transmitter-receiver 
pairs of network nodes. WDM technology establishes communication by finding 
transmitter-receiver paths and assigning a wavelength to each path, so that no 
two paths going through the same fiber are assigned the same wavelength. Since 
state-of-the-art technology [20] allows for a limited number of wavelengths, the 
important engineering question to be solved is to establish communication so 
that the total number of wavelengths used is minimized. 

The path coloring problem in trees has been proved to be NP-hard in [5], thus 
the work on the topic mainly focuses on the design and analysis of approximation 
algorithms. Known results are expressed in terms of the load $l$ of $R$, i.e., the 
maximum number of paths that share a directed arc of $T$. An algorithm that 
assigns at most $2l$ colors to any set of paths of load $l$ can be derived from the work of 
Raghavan and Upfal [19] on the undirected version of the problem. Alternatively, 
we say that this algorithm has performance ratio 2. Mihail et al. [16] give a 
15/8 upper bound. Kaklamanis and Persiano [11] and independently Kumar and 
Schwabe [15] improve the upper bound to 7/4. The best known upper bound is 
5/3 [12]. 
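
To make the two central notions concrete, the following Python sketch computes the load of a path set and checks whether a color assignment is proper. The representation of a path as a list of consecutive tree nodes, and the names `arcs`, `load`, and `is_proper_coloring`, are assumptions of this sketch.

```python
from collections import Counter


def arcs(path):
    """Directed arcs (u, v) traversed by a path given as a list of nodes."""
    return list(zip(path, path[1:]))


def load(paths):
    """Maximum number of paths that share a directed arc."""
    counts = Counter(arc for path in paths for arc in arcs(path))
    return max(counts.values(), default=0)


def is_proper_coloring(paths, colors):
    """True if no two paths sharing a directed arc received the same color."""
    used = set()
    for path, color in zip(paths, colors):
        for arc in arcs(path):
            if (arc, color) in used:
                return False
            used.add((arc, color))
    return True


# Hypothetical instance on a tree with root 0 and children 1 and 2.
paths = [[1, 0, 2], [1, 0], [2, 0, 1], [0, 2]]
path_load = load(paths)
```

Paths that traverse the same tree edge in opposite directions use different directed arcs, so they do not conflict; in the instance above the path from 1 to 2 and the path from 2 to 1 may share a color.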

All the above algorithms are deterministic and greedy in the following sense: 
they visit the tree in a top to bottom manner and at each node v color all paths 
that touch node v and are still uncolored; moreover, once a path has been colored, 
it is never recolored again. In the context of WDM routing, greedy algorithms 
are important as they are simple and, more importantly, they are amenable to 
being implemented easily and fast in a distributed environment. Kaklamanis et 
al. [12] prove that no greedy algorithm can achieve better performance ratio than 
5/3. 

The path coloring problem on binary trees is also NP-hard [6]. In this case,
we express the results in terms of the depth of the tree, as well. All the known
upper and lower bounds hold in this case. Simple deterministic greedy algorithms
that achieve the 5/3 upper bound in binary trees are presented in [3] and [10].
The best known lower bound on the performance ratio of any algorithm is 5/4
[15], i.e., there exists a binary tree T of depth 3 and a set of paths R of load l
on T that cannot be colored with fewer than 5l/4 colors.

Randomization has been used as a tool for the design of path coloring
algorithms on rings and meshes. Kumar [14] presents an algorithm that takes
advantage of randomization to round the solution of an integer linear
programming relaxation of the circular arc coloring problem. As a result, he
improves the upper bound on the approximation ratio for the path coloring
problem in rings to 1.37 + o(1). Rabani [18] also follows a randomized rounding
approach and presents an existential constant upper bound on the approximation
ratio for the path coloring problem on meshes.

In this paper we define the class of greedy algorithms that use randomization.
We study their limitations, proving lower bounds on their performance when
they are applied to either large or small trees. In particular, we prove that,
with high probability, randomized greedy algorithms cannot achieve a
performance ratio better than 3/2 when applied to binary trees of depth Ω(l),
while their performance ratio is at least 1.293 − o(1) when applied to trees of
constant depth.

We also exploit inherent advantages of randomized greedy algorithms and,
using limited recoloring, we obtain the first randomized algorithm that colors
sets of paths of load l on binary trees of depth o(l^{1/3}) using at most
7l/5 + o(l) colors. For the analysis, we use tail inequalities for random
variables that follow the hypergeometrical distribution. Our upper bound holds
with high probability under the assumption that the load l is large. Our
analysis also yields an existential upper bound of 7l/5 + o(l) on the number of
colors sufficient for coloring any set of paths of load l that holds on any
binary tree.

The rest of the paper is structured as follows. In Section 2, we present new
technical lemmas for random variables following the hypergeometrical
probability distribution that may be of independent interest. In Section 3, we
define randomized greedy algorithms and study their limitations, proving our
lower bounds. Finally, in Section 4, we present our constructive and existential
upper bounds.

Due to lack of space, we omit formal proofs from this extended abstract. 
Instead, we provide outlines for the proofs, including formal statements of most 
claims and lemmas used to prove our main theorems. The complete proofs can 
be found in the full version of the paper [1] . 

2 Preliminaries 

In this section we present tail bounds for hypergeometrical (and hypergeometrical- 
like) probability distributions. These bounds will be very useful for proving both 
the upper and the lower bound for the path coloring problem. Our approach is 
similar to the one used in [13] (see also [17]) to calculate the tail bounds of a 
well known occupancy problem. We exploit the properties of special sequences 
of random variables called martingales, using Azuma’s inequality [2] for their 
analysis. Similar results in a more general context are presented in [21]. 

Consider the following process. We have a collection of n balls, of which αn
are red and (1 − α)n are black (0 < α < 1). We select without replacement
uniformly at random βn balls (0 < β < 1). Let Ω_1 be the random variable
representing the number of red balls that are selected; it is known that Ω_1
follows the hypergeometrical probability distribution [7]. We give bounds for the
tails of the distribution of Ω_1.

Lemma 1. The expectation of Ω_1 is

$$E[\Omega_1] = \alpha\beta n$$

and

$$\Pr\left[\,\left|\Omega_1 - E[\Omega_1]\right| > \sqrt{2\beta\gamma n}\,\right] \le 2e^{-\gamma},$$

for any γ > 0.
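
The statement of Lemma 1 can be sanity-checked by simulation. The sketch below assumes the bound reads Pr[|Ω_1 − αβn| > sqrt(2βγn)] ≤ 2e^{−γ}; the function names, parameter values, and the fixed seed are illustrative choices, and the experiment is an empirical check, not a proof.

```python
import math
import random


def sample_red_count(n, alpha, beta, rng):
    """Draw round(beta*n) of n balls without replacement; count the red ones."""
    red = round(alpha * n)            # balls 0 .. red-1 are red
    chosen = rng.sample(range(n), round(beta * n))
    return sum(1 for ball in chosen if ball < red)


def empirical_tail(n, alpha, beta, gamma, trials, seed=0):
    """Fraction of trials whose deviation from alpha*beta*n exceeds the threshold."""
    rng = random.Random(seed)
    mean = alpha * beta * n
    threshold = math.sqrt(2 * beta * gamma * n)
    hits = sum(
        abs(sample_red_count(n, alpha, beta, rng) - mean) > threshold
        for _ in range(trials)
    )
    return hits / trials


tail = empirical_tail(n=1000, alpha=0.3, beta=0.5, gamma=2.0, trials=2000)
bound = 2 * math.exp(-2.0)
```

For these parameters the threshold is several standard deviations wide, so the observed tail frequency should lie well below the bound.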



Now consider the following process. We have a collection of n balls, of which
αn are red and (1 − α)n are black (0 < α < 1). We execute the following two-step
experiment. First, we select without replacement uniformly at random β_1 n out
of the n balls, and then, starting again with the same n balls, we select without
replacement uniformly at random β_2 n out of the n balls (0 < β_1, β_2 < 1). We
study the distribution of the random variable Ω_2 representing the number of red
balls that are selected in both selections.

Lemma 2. The expectation of Ω_2 is

$$E[\Omega_2] = \alpha\beta_1\beta_2 n$$

and

$$\Pr\left[\,\left|\Omega_2 - E[\Omega_2]\right| > 2\sqrt{2\min\{\beta_1,\beta_2\}\,\gamma n}\,\right] \le 4e^{-\gamma},$$

for any γ > 0.
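
The two-step experiment can be simulated in the same spirit. The sketch below only checks the expectation, assuming it equals α·β_1·β_2·n; all names and parameter values are illustrative.

```python
import random


def red_in_both(n, alpha, beta1, beta2, rng):
    """Number of red balls picked in both independent selections."""
    red = round(alpha * n)                        # balls 0 .. red-1 are red
    first = set(rng.sample(range(n), round(beta1 * n)))
    second = rng.sample(range(n), round(beta2 * n))
    return sum(1 for ball in second if ball < red and ball in first)


def mean_red_in_both(n, alpha, beta1, beta2, trials, seed=0):
    """Average of red_in_both over independent repetitions."""
    rng = random.Random(seed)
    total = sum(red_in_both(n, alpha, beta1, beta2, rng) for _ in range(trials))
    return total / trials


observed = mean_red_in_both(n=1000, alpha=0.4, beta1=0.5, beta2=0.25, trials=2000)
expected = 0.4 * 0.5 * 0.25 * 1000   # alpha * beta1 * beta2 * n
```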



3 Randomized Greedy Algorithms 

Greedy algorithms have a top-down structure, like the algorithms presented in
[16,11,15,12,3,10]. Starting from a node, the algorithm computes a breadth-first
numbering of the nodes of the tree. The algorithm proceeds in phases, one per
node v of the tree. The nodes are considered in the order of their breadth-first
numbering. In the phase associated with node v, it is assumed that we already
have a partial proper coloring where all paths that touch (i.e., start, end, or go
through) nodes with numbers strictly smaller than v's have been colored and
that no other path has been colored. During this phase, the partial coloring is
extended to one that assigns proper colors to all paths that touch v but have
not been colored yet. During each phase, the algorithm does not recolor paths
that have been colored in previous phases. So far, only deterministic greedy
algorithms have been studied. The deterministic greedy algorithm presented in
[12] guarantees that any set of paths of load l can be colored with 5l/3 colors.
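
The phase structure described above can be sketched as follows. This is a plain first-fit greedy, not the algorithm of [12]: it illustrates only the breadth-first phases and the rule that colored paths are never recolored, and it carries no 5l/3 guarantee. The adjacency-list input format and all function names are assumptions.

```python
from collections import deque


def bfs_order(adj, root):
    """Breadth-first numbering of the tree plus the parent of every node."""
    order, parent = [root], {root: None}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                order.append(y)
                queue.append(y)
    return order, parent


def path_nodes(parent, u, v):
    """Nodes on the unique u -> v path, via the lowest common ancestor."""
    anc = [u]
    while parent[anc[-1]] is not None:
        anc.append(parent[anc[-1]])
    pos = {x: i for i, x in enumerate(anc)}
    down = [v]
    while down[-1] not in pos:
        down.append(parent[down[-1]])
    return anc[: pos[down[-1]] + 1] + down[-2::-1]


def greedy_color(adj, root, paths):
    """First-fit coloring of directed paths, one phase per node in BFS order."""
    order, parent = bfs_order(adj, root)
    nodes = [path_nodes(parent, u, v) for u, v in paths]
    arcs = [list(zip(p, p[1:])) for p in nodes]
    used = {}                       # directed arc -> colors already placed on it
    color = [None] * len(paths)
    for v in order:                 # phase associated with node v
        for i, p in enumerate(nodes):
            if color[i] is None and v in p:
                taken = set().union(*(used.get(a, set()) for a in arcs[i]))
                c = 0
                while c in taken:   # smallest color free on every arc of the path
                    c += 1
                color[i] = c        # fixed from now on: no recoloring
                for a in arcs[i]:
                    used.setdefault(a, set()).add(c)
    return color


adj = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
paths = [(3, 2), (4, 2), (3, 4), (2, 3)]
colors = greedy_color(adj, 0, paths)
```

A randomized greedy algorithm in the sense of this section would replace the first-fit choice inside each phase by a random proper coloring drawn from some distribution.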

A randomized greedy algorithm A uses a palette of colors and proceeds in 
phases. At each phase associated with a node v, A picks a random proper coloring 
of the uncolored paths using colors of the palette according to some probability 
distribution. 

We can prove that no randomized greedy algorithm can achieve a performance
ratio better than 3/2 if the depth of the tree is large.

Theorem 3. Let A be a (possibly randomized) greedy path coloring algorithm on
binary trees. There exists a randomized algorithm ADV which, on input ε > 0
and integer l > 0, outputs a binary tree T of depth l + ε ln l + 2 and a set R of
paths of maximum load l on T, such that the probability that A colors R with at
least 3l/2 colors is at least 1 — exp(—

We can also prove the following lower bound that captures the limitations of
randomized greedy algorithms even on small trees.

Theorem 4. Let A be a (possibly randomized) greedy path coloring algorithm on
binary trees. There exists a randomized algorithm ADV which, on input δ > 0
and integer l > 0, outputs a binary tree T of constant depth and a set R of paths
of maximum load l on T, such that the probability that A colors R with at least
(1.293 − δ − o(1))l colors is at least 1 — 0{l~^).

Furthermore, Theorem 4 can be extended for the case of randomized algo- 
rithms with a greedy structure that allow for limited recoloring like the one we 
present in the next section. 

4 Upper Bounds 

In this section we present our randomized algorithm for the path coloring
problem on binary trees. Note that this is the first randomized algorithm for
this problem. Our algorithm has a greedy structure but allows for limited
recoloring. We first present three procedures (namely Preprocessing Procedure,
Recoloring Procedure, and Coloring Procedure) that are used as subroutines by
the algorithm, in Sections 4.1, 4.2, and 4.3, respectively. Then, in Section 4.4,
we give the description of the algorithm and the analysis of its performance. In
particular, we show how our algorithm can color any set of paths of load l on
binary trees of depth o(l^{1/3}) using at most 7l/5 + o(l) colors. Our analysis
also yields an existential upper bound on the number of colors sufficient for
coloring any set of paths of load l on any binary tree (of any depth), which is
presented in Section 4.5.



4.1 The Preprocessing Procedure 

Given a set R* of directed paths of maximum load l on a binary tree T, we want
to transform it to another set R of paths that satisfies the following properties:

— Property 1: They have full load l at each directed arc.

— Property 2: For every node v, paths that originate or terminate at node
v appear on only one of the three arcs adjacent to v.

— Property 3: For every node v, the number of paths that originate from v
is equal to the number of paths that are destined for v (note that property
3 is a corollary of properties 1 and 2).

This is done by a Preprocessing Procedure, described in the following.
First, the set of paths is transformed to a full-load superset of paths by adding
single-hop paths on the directed arcs that are not fully loaded. Next, the
non-leaf nodes of the tree are traversed in a BFS manner. Consider the step of
the traversal associated with a node v. Let R_v be the set of paths that touch v.
R_v is the union of two disjoint sets of paths: S_v, the set of paths that either
originate or terminate at v, and P_v, the set of paths that go through v. Pairs
of paths (v_1,v), (v,v_2) in S_v are combined to create one path (v_1,v_2) that
goes through v. Paths (v_1,v), (v,v_2) are deleted from S_v and the path
(v_1,v_2) is inserted into P_v.

Lemma 5. Consider a set of directed paths R* of maximum load l on a binary
tree T. After the application of the Preprocessing Procedure, the set R of directed
paths that is produced satisfies properties 1, 2, and 3.

It is easy to see how to obtain a legal coloring for the original pattern if we 
have a coloring of the pattern which is produced after the application of the 
Preprocessing Procedure. 

In the rest of the paper we consider only sets of paths that satisfy properties
1, 2 and 3. Let R be a set of paths that satisfies properties 1, 2 and 3 on a binary
tree T, and let v be a non-leaf node of the tree. Assuming that the nodes of T
are visited in a BFS manner, let p(v) be the parent node of v, and l(v), r(v) its
left and right child nodes, respectively.





Fig. 1. The two cases that need to be considered for the analysis of path coloring
algorithms on binary trees. Numbers represent groups of paths (number 1 implies
the set of paths M_v^1, etc.).



We partition the set P_v of paths that go through v into the following disjoint
subsets, which we call groups: the group M_v^1 of the paths that come from p(v)
and go to l(v), the group M_v^2 of the paths that come from l(v) and go to p(v),
the group M_v^3 of the paths that come from p(v) and go to r(v), the group M_v^4
of the paths that come from r(v) and go to p(v), the group M_v^5 of the paths
that come from l(v) and go to r(v), and the group M_v^6 of the paths that come
from r(v) and go to l(v). Since P_v satisfies properties 1, 2, and 3, we only need
to consider the two cases for paths in S_v depicted in Figure 1.

— Scenario I: The paths in S_v touch node p(v): the set S_v is composed of
the group M_v^7 of the paths that come from p(v) and stop at v and the group
M_v^8 of the paths that originate from v and go to p(v).

— Scenario II: The paths in S_v touch a child node of v (wlog r(v)): the set
S_v is composed of the group M_v^7 of the paths that originate from v and go
to r(v) and the group M_v^8 of the paths that come from r(v) and stop at v.



4.2 The Recoloring Procedure 

Let T be a binary tree with 4 nodes, of the form depicted in Figure 1. Let
v be the node with degree 3, p(v) its parent, and l(v), r(v) its left and right
child nodes, respectively. Consider a set R_v of paths of full load l on T which is
partitioned into groups M_v^i as described in the previous section.

Also consider a random coloring of the paths that traverse the opposite
directed arcs between p(v) and v with tl colors (1 ≤ t ≤ 2). We assume that the
coloring has been selected among all possible proper colorings according to some
probability distribution 𝒫. Let S be the number of single colors, i.e., colors
assigned to only one path that traverses the arc (p(v),v) in either direction,
and D the number of double colors, i.e., colors assigned to two paths that
traverse the arc (p(v),v) in opposite directions. Since the load of R_v is l, we
obtain that D = (2 − t)l and S = 2(t − 1)l.
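
The two values follow from a short counting argument, spelled out here for convenience. It uses only the facts stated above: tl colors are used in total, and each of the two opposite arcs carries exactly l paths.

```latex
\begin{align*}
S + D  &= tl  && \text{(every color is either single or double)} \\
S + 2D &= 2l  && \text{(the two opposite arcs carry $2l$ paths in total)} \\
\Rightarrow\quad D &= 2l - tl = (2 - t)\,l, \qquad S = tl - D = 2(t - 1)\,l .
\end{align*}
```

For t = 6/5 this gives D = 4l/5 and S = 2l/5.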

Definition 6. Let D = (2 − t)l. A probability distribution 𝒫 over all proper
colorings of paths in R_v that traverse the arc (p(v),v) with tl colors is weakly
uniform if for any two paths r_1, r_2 ∈ R_v that traverse arc (p(v),v) in opposite
directions, the probability that r_1 and r_2 are assigned the same color is D/l^2.



We will give an example of a random coloring of paths in R_v that traverse
the arc (p(v),v) with tl colors according to the weakly uniform probability
distribution. Let D = (2 − t)l. We use a set X_1 of l colors and assign them to
paths in R_v that traverse the directed arc (p(v),v). Then, we define the set X_2
of colors as follows. We randomly select D colors of X_1 and use l − D additional
colors. For the paths in R_v that traverse the directed arc (v,p(v)), we select
uniformly at random a coloring among all possible colorings with colors of X_2.
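
The example construction can be written down directly. In the sketch below colors are integers, the i-th entry of each returned list is the color of the i-th path in that direction, and the function name and parameters are illustrative assumptions. A fixed path r_1 on (p(v),v) has its color reused with probability D/l, and that color then lands on a fixed path r_2 on (v,p(v)) with probability 1/l, which gives D/l^2.

```python
import random


def weakly_uniform_coloring(l, D, rng):
    """Sample colors for the l paths on (p(v),v) and the l paths on (v,p(v)).

    Returns (down, up): down[i] is the color of the i-th path on (p(v),v),
    up[j] the color of the j-th path on (v,p(v)).
    """
    x1 = list(range(l))                      # palette X_1, one color per path
    shared = rng.sample(x1, D)               # D colors of X_1 reused as double colors
    x2 = shared + list(range(l, 2 * l - D))  # palette X_2: D shared + (l - D) new
    rng.shuffle(x2)                          # uniformly random assignment
    return x1, x2


rng = random.Random(1)
l, D = 10, 8                                 # t = 6/5, so D = (2 - t) * l = 8
down, up = weakly_uniform_coloring(l, D, rng)
total_colors = len(set(down) | set(up))      # equals 2l - D = t * l
```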

Let C be a coloring of paths in R_v that traverse the arc (p(v),v). We denote
by A_i the set of single colors assigned to paths in M_v^i, and by A_ij the set of
double colors assigned to paths in groups M_v^i and M_v^j. Clearly, the numbers
|A_i| and |A_ij| are random variables following the hypergeometrical
distribution with expectation

$$E[|A_i|] = \frac{|M_v^i|\,S}{2l} \quad\text{and}\quad E[|A_{ij}|] = \frac{|M_v^i|\,|M_v^j|\,D}{l^2}.$$


In the following we use i and ij for indexing sets of single and double
colors, respectively. In addition, we use the expressions “for any i” and “for any
pair i,j” as shorthand for the phrases “for any i such that paths
in group M_v^i traverse the arc (p(v),v) in some direction” and “for any pair i,j
such that paths in groups M_v^i and M_v^j traverse the arc (p(v),v) in opposite
directions”, respectively.



Definition 7. Let D = (2 − t)l. A probability distribution 𝒫 over all proper
colorings of paths in R_v that traverse the arc (p(v),v) with tl colors satisfying

$$|A_{ij}| = \frac{D\,|M_v^i|\,|M_v^j|}{l^2}$$

for any pair i,j, is strongly uniform if for any two paths r_1, r_2 ∈ R_v that
traverse arc (p(v),v) in opposite directions, the probability that r_1 and r_2 are
assigned the same color is D/l^2.



Since, for any i, |M_v^i| = |A_i| + Σ_j |A_ij|, we obtain that a coloring chosen
according to the strongly uniform probability distribution satisfies

$$|A_i| = \frac{|M_v^i|\,S}{2l}.$$


Assume that we are given a tree T of 4 nodes (of the form depicted in
Figure 1), a set of paths R_v of full load l on T as described, and a random
coloring C of the paths in R_v that traverse the arc (p(v),v) with tl colors chosen
according to the weakly uniform probability distribution. The Recoloring
Procedure will recolor a set R'_v ⊆ R_v of paths that traverse the arc (p(v),v) so
that the new coloring C' of paths in R_v that traverse the arc (p(v),v) is a random
coloring with tl colors chosen according to the strongly uniform probability
distribution.

We now give the description of the Recoloring Procedure. First, each color
is marked with probability p. Let X be the set of marked colors; it consists of
the following disjoint sets of colors: the sets X_i = A_i ∩ X, for any i, and the
sets X_ij = A_ij ∩ X, for any pair i,j.

We set

$$y_i = |A_i| - \frac{|M_v^i|\,S\,(1 - l^{-1/3})}{2l} \qquad (1)$$

for any i, and

$$y_{ij} = |A_{ij}| - \frac{|M_v^i|\,|M_v^j|\,D\,(1 - l^{-1/3})}{l^2} \qquad (2)$$

for any pair i,j. Clearly, the conditions 0 ≤ y_i ≤ |X_i| and 0 ≤ y_ij ≤ |X_ij| are
necessary for the procedure described in the following to be feasible. For the
moment we assume that 0 ≤ y_i ≤ |X_i| and 0 ≤ y_ij ≤ |X_ij|.

We select a random set Y_i of y_i colors of X_i, for any i, and a random set Y_ij
of y_ij colors of X_ij, for any pair i,j. Let Y be the union of all sets Y_i and Y_ij
and R'_v the set of paths of R_v that traverse the arc (p(v),v) colored with colors
in Y. Using (1) and (2), adding all y_i's and y_ij's we obtain

$$|Y| = \sum_i y_i + \sum_{i,j} y_{ij} = t\,l^{2/3},$$

while the load of R'_v in the opposite directed arcs between p(v) and v is

$$\sum_i y_i + \sum_{i,j} y_{ij} = l^{2/3},$$

where i and j take values such that paths in groups M_v^i traverse the directed
arc (p(v),v) and paths in groups M_v^j traverse the directed arc (v,p(v)). The
Recoloring Procedure ends by producing a random coloring C'' of paths in R'_v
with the t l^{2/3} colors of Y according to the strongly uniform probability
distribution.
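
As a check on these two totals, the sums can be worked out explicitly. The derivation below assumes that equations (1) and (2) have the form y_i = |A_i| − (1 − l^{-1/3})|M_v^i|S/(2l) and y_ij = |A_ij| − (1 − l^{-1/3})|M_v^i||M_v^j|D/l^2; it uses only S + D = tl, S/2 + D = l, and the fact that the groups traversing the arc in each direction have total size l.

```latex
\begin{align*}
\sum_i y_i + \sum_{i,j} y_{ij}
  &= \Big(\sum_i |A_i| + \sum_{i,j} |A_{ij}|\Big)
     - (1 - l^{-1/3})\Big(\frac{S}{2l}\sum_i |M_v^i|
     + \frac{D}{l^2}\sum_{i,j} |M_v^i|\,|M_v^j|\Big) \\
  &= (S + D) - (1 - l^{-1/3})\Big(\frac{S}{2l}\cdot 2l + \frac{D}{l^2}\cdot l\cdot l\Big)
   = (S + D)\,l^{-1/3} = t\,l^{2/3}, \\[4pt]
\sum_{i \,:\, M_v^i \text{ on } (p(v),v)} y_i + \sum_{i,j} y_{ij}
  &= \Big(\frac{S}{2} + D\Big) - (1 - l^{-1/3})\Big(\frac{S}{2l}\cdot l + D\Big)
   = \Big(\frac{S}{2} + D\Big)\,l^{-1/3} = l^{2/3}.
\end{align*}
```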
Now, our argument is divided in two parts. First, in Claim 8, we show that,
if y_i and y_ij are of the correct size and C'' is a random coloring of paths in
R'_v that traverse the arc (p(v),v) with t l^{2/3} colors according to the strongly
uniform probability distribution, then the new coloring C' of the paths in R_v
that traverse the arc (p(v),v) is a random coloring with tl colors according to
the strongly uniform probability distribution. Then, in Lemma 9, we show that,
for t = 6/5 and sufficiently large values of l and the cardinality of each group
M_v^i, it is possible to fix the marking probability p so that y_i and y_ij have the
correct size, with high probability.

Claim 8. If 0 ≤ y_i ≤ |X_i| for any i, and 0 ≤ y_ij ≤ |X_ij| for any pair i,j,
the Recoloring Procedure produces a random coloring C' of paths in R_v that
traverse the arc (p(v),v) with tl colors according to the strongly uniform
probability distribution.

In the following we concentrate on the case t = 6/5, which is
sufficient for the proof of the upper bound. The following lemma gives sufficient
conditions for the correctness of the Recoloring Procedure. It can be adjusted
(with a modification of the constants) so that it works for any t ∈ [1,2].

Lemma 9. Let 0 < δ < 1/3 be a constant, t = 6/5, p = 5l^{-1/3}, and l > 125. If
each non-empty group M_v^i has cardinality at least

max{6.25l^/^+^ 1.16?®/®+'^/^} 

then the Recoloring Procedure is correct with probability at least
1 − 63 exp(−l^δ/8).

For proving the lemma, we use the Chernoff-Hoeffding bound [4,9] together
with the tail bounds for hypergeometrical probability distributions (Lemmas 1
and 2) to prove that the hypotheses of Claim 8 are true, with high probability.

Remark. The numbers y_i and y_ij given by equations (1) and (2) must be
integral. In the full version of the paper [1], we prove additional technical claims
that help us to handle these technical details by adding paths to the original set
of paths, increasing the load by an o(l) term.



4.3 The Coloring Procedure 

Assume again that we are given a tree T of 4 nodes (of the form depicted in
Figure 1), a set of paths R_v of full load l on T as described in the previous
section, and a random coloring of the paths in R_v that traverse the arc (p(v),v)
with 6l/5 colors according to the strongly uniform probability distribution. The
Coloring Procedure extends this coloring to the paths that have not been colored
yet, i.e., the paths in R_v that do not traverse the arc (p(v),v). The performance
of the Coloring Procedure is stated in the following lemma.

Lemma 10. The Coloring Procedure colors the paths in R_v that have not been
colored yet in such a way that at most 7l/5 total colors are used for all paths in
R_v, that the number of colors seen by the arcs (v,l(v)) and (v,r(v)) is exactly
6l/5, and that paths in R_v traversing these arcs are randomly colored according
to the weakly uniform probability distribution.

According to Section 4.1, for the Coloring Procedure we need to distinguish
between Scenarios I and II. The complete description of the Coloring Procedure
and the proof of Lemma 10 can be found in [1].



4.4 The Path Coloring Algorithm 

In this section we give the description of our algorithm and discuss its perfor- 
mance. In particular, we prove the following theorem. 

Theorem 11. There exists a randomized algorithm that, for any constant δ <
1/3, colors any set of directed paths of maximum load l on a binary tree of
depth at most l^δ/8, using at most 7l/5 + o(l) colors, with probability at least
1 — exp (— . 

Let T be a binary tree and R* a set of paths of load I on T. Our algorithm 
uses as subroutines the Preprocessing Procedure, the Recoloring Procedure and 
the Coloring Procedure described in the previous sections. 

First, we execute the Preprocessing Procedure on R* and we obtain a new 
set R of paths that satisfies properties 1, 2, and 3, as described in Section 4.1. 

Then, the algorithm roots the tree at a leaf node r(T), and produces a random
coloring of the paths that share the opposite directed arcs adjacent to r(T)
with exactly 6l/5 colors. This can be done by assigning l colors to the paths that
originate from r(T), randomly selecting 4l/5 of these colors, and randomly
assigning these colors and l/5 additional colors to the paths destined for r(T).
Note that, in this way, the coloring of the paths that share the opposite directed
arcs adjacent to r(T) is random according to the weakly uniform probability
distribution. Then, the algorithm performs the following procedure Color
on the child node of r(T).

The procedure Color at a node v takes as input the set R_v of paths that
touch v together with a random coloring of paths that traverse the arc (p(v),v)
with 6l/5 colors according to the weakly uniform probability distribution. The
procedure immediately stops if v is a leaf. Otherwise, the Recoloring Procedure
is executed, producing a random coloring of paths traversing arc (p(v),v) with
6l/5 colors according to the strongly uniform probability distribution. We denote
by R'_v the set of paths that are recolored by the Recoloring Procedure. Then,
the Coloring Procedure is executed, producing a coloring of the paths in R_v that
have not been colored yet, using at most 7l/5 colors in total, in such a way
that the number of colors seen by the opposite directed arcs between v and
r(v) and the opposite directed arcs between v and l(v) is exactly 6l/5 and that
the colorings of paths traversing these arcs are random according to the weakly
uniform probability distribution. The procedure recursively executes Color for
the child nodes of v, r(v) and l(v).

After executing procedure Color recursively on every node of T, all paths
in R have been properly colored except for the paths in the set ∪_v R'_v; these
are the paths recolored during the execution of the Recoloring Procedure at
all nodes. Our algorithm ends by properly coloring the paths in ∪_v R'_v; this can
be done easily with the greedy deterministic algorithm using at most o(l) extra
colors because of the following Lemma 12.

Lemma 12. Let 0 < δ < 1/3 be a constant. Consider the execution of the
algorithm on a binary tree of depth at most l^δ/8. The load of all paths that are
recolored by the Recoloring Procedure is at most with probability at least 

1 — exp {-f2(pC+5^). 

For proving Lemma 12, we actually prove the stronger statement that, with 
high probability, the load of the paths that are marked (which is a superset of 
the paths that are recolored) is at most 2^/^+*^. 



4.5 An Existential Upper Bound 

The analysis of the previous sections implies that, when the algorithm begins
the execution of the phase associated with a node v, with some (small) positive
probability, the numbers of single and double colors are equal to their
expectations. We may assume that the coloring of paths traversing the arc
(p(v),v) is random according to the strongly uniform probability distribution.
Thus, with non-zero probability, the execution of the Recoloring Procedure is
unnecessary at all steps of the algorithm. Using this probabilistic argument
together with additional technical claims, we obtain an existential upper bound
on the number of colors sufficient for coloring any set of paths of load l.



Theorem 13. Any set of paths of load I on a binary tree can be colored with at 
2 



most 






+ 10 I colors. 



Note that Theorem 13 holds for any binary tree (of any depth). In particular,
for binary trees of depth o(l^{1/3}), it improves on the large hidden constants
implicit in the o(l) term of our constructive upper bound (Theorem 11). In larger
binary trees, it significantly improves the 5/3 constructive upper bound for all
sets of paths of load greater than 26,000.
Abstract. Wavelength rerouting has been suggested as a viable and
cost-effective method to improve the blocking performance of wavelength- 
routed Wavelength-Division Multiplexing (WDM) networks. This method 
leads to the following combinatorial optimization problem, dubbed Vene- 
tian Routing. Given a directed multigraph G along with two vertices s 
and t and a collection of pairwise arc-disjoint paths, we wish to find an 
st-path which arc-intersects the smallest possible number of such paths. 

In this paper we prove the computational hardness of this problem even 
in various special cases, and present several approximation algorithms 
for its solution. In particular we show a non-trivial connection between 
Venetian Routing and Label Cover. 

1 Introduction 

We will be concerned with an optimization problem called Venetian Routing
(VR), and with some of its relatives. (There is an interesting analogy with
routing boats (gondolas) in the city of Venice, which motivates the name; this
analogy will be discussed in the full version of this paper.) The input to VR
consists of a directed multigraph G, two vertices s and t of G, respectively called
the source and the sink, and a collection of pairwise arc-disjoint paths in G.
These paths will be referred to as special or laid-out paths. The goal is to find
an st-path which arc-intersects the smallest possible number of laid-out paths.
(Recall that G is a multigraph. If e_1 and e_2 are parallel arcs, it is allowed for
one laid-out path Q_1 to contain e_1, and for another laid-out path Q_2 to contain
e_2. But Q_1 and Q_2 cannot both contain e_1.)
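
To fix ideas, the objective can be evaluated by exhaustive search on tiny instances. The sketch below is only an illustration of the problem definition, with exponential running time; it is not one of the algorithms of this paper. Arcs carry identifiers so that parallel arcs are distinct, and the input format and function names are assumptions.

```python
def crossed(path_arcs, laid_out):
    """Number of laid-out paths sharing at least one arc with the st-path."""
    arcs = set(path_arcs)
    return sum(1 for q in laid_out if arcs & set(q))


def venetian_routing_bruteforce(arcs, s, t, laid_out):
    """Try every simple st-path and keep one crossing the fewest laid-out paths.

    arcs: dict arc_id -> (tail, head); laid_out: list of lists of arc ids.
    Returns (cost, list of arc ids), or None if t is unreachable from s.
    """
    out = {}
    for a, (u, v) in arcs.items():
        out.setdefault(u, []).append((a, v))
    best = None

    def dfs(node, seen, used):
        nonlocal best
        if node == t:
            cost = crossed(used, laid_out)
            if best is None or cost < best[0]:
                best = (cost, list(used))
            return
        for a, nxt in out.get(node, []):
            if nxt not in seen:          # simple paths suffice for this objective
                seen.add(nxt)
                used.append(a)
                dfs(nxt, seen, used)
                used.pop()
                seen.remove(nxt)

    dfs(s, {s}, [])
    return best


arcs = {"a": ("s", "x"), "b": ("x", "t"), "c": ("s", "y"), "d": ("y", "t")}
laid_out = [["a", "b"], ["c"], ["d"]]
best = venetian_routing_bruteforce(arcs, "s", "t", laid_out)
```

Here the route through x uses two arcs of the same laid-out path and so crosses one path, while the route through y crosses two.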

This problem arises from wavelength rerouting in Wavelength-Division
Multiplexing (WDM) networks. In WDM networks, lightpaths are established between
pairs of vertices by allocating the same wavelength throughout the path [3]: two
lightpaths can use the same fiber link if they use a different wavelength. When 
a connection request arrives, a proper lightpath (i.e., a route and a wavelength) 
must be chosen. The requirement that the same wavelength must be used in all 
the links along the selected route is known as the wavelength continuity con- 
straint. This constraint often leads to poor blocking performance: a request may 
be rejected even when a route is available, if the same wavelength is not available 
all along the route. To improve the blocking performance, one proposed mech- 
anism is wavelength rerouting: whenever a new connection arrives, wavelength 
rerouting may move a few existing lightpaths to different wavelengths in order to 
make room for the new connection. Lee and Li [10] proposed a rerouting scheme 
called “Parallel Move-To-Vacant Wavelength Rerouting (MTV-WR)”. Given a
connection request from s to t, rather than just blocking the request if there is
currently no available lightpath from s to t, this scheme does the following. Let
A(W) denote the set of current lightpaths that use wavelength W and that may
be migrated to some other wavelength. (That is, for each Q ∈ A(W), there is
some W' ≠ W such that no lightpath with wavelength W' uses any arc of Q.)
Separately for each W, we try to assign the new connection request to
wavelength W. To this end, we solve VR on the sub-network obtained by deleting all
wavelength-W lightpaths not lying in A(W); the set of laid-out paths is taken to
be A(W). (Since all lightpaths with the same wavelength must be arc-disjoint,
the “arc-disjointness” condition of VR is satisfied.) If there is no feasible solution 
to any of these VR instances, the connection request is rejected. Otherwise, let 
P be an st-path representing the best solution obtained over all W. By migrat- 
ing each lightpath arc-intersected by P to an available wavelength, we obtain 
an st-path with minimal disruption of existing routes. In the case of directed 
multi-ring network topologies, such an approach achieves a reduction of about 
20% in the blocking probability [11]. 

Thus, VR naturally arises in such rerouting schemes, helping in dynamic 
settings where future requests cannot be predicted and where one wishes to 
accommodate new requests reasonably fast without significantly disrupting the 
established lightpaths. VR is also naturally viewed as finding a minimum number 
of laid-out paths needed to connect s to t. If the different laid-out paths are leased 
out by different providers, this problem of setting up a connection from s to t at 
minimum cost (assuming unit cost per laid-out path) is modeled by VR. Lee and 
Li gave a polynomial-time algorithm for VR on undirected networks [10]. This 
algorithm was later improved in [12]. To the best of our knowledge, no result 
(including the possible hardness of exactly solving VR) was known for general 
directed networks. 

A natural generalization of VR is to associate a priority with each laid-out 
path, where we wish to find an st-path P for which the sum of the priorities 
of the laid-out paths which arc-intersect P is minimized; we call this PVR. An 
important special case of PVR is Weighted VR or WVR, where the priority 
of a laid-out path is its length. This is useful in our wavelength rerouting
context, since the effort in migrating a lightpath to a new wavelength is roughly
proportional to the length of the path. Finally, one could look at VR from the
complementary angle: finding a maximum number of laid-out paths that can be 
removed without disconnecting t from s. Naturally, this variant will be referred 
to as the Gondolier Avoidance problem (GA). 

Our Results. In this paper we study the worst-case complexity of VR and of
its variants, and present several hardness results and approximation algorithms.
In particular, we show an interesting connection between VR and the Symmetric
Label Cover (SLC) problem [1,8,4] which allows us to obtain new upper
bounds for the latter. In what follows, let ℓ denote the maximum length (number
of arcs) of any laid-out path, and let n and m be the number of vertices and
arcs in G, respectively. Let QP stand for quasi-polynomial time, i.e.,
∪_{c>0} DTIME[2^{log^c n}].
For lower bounds, we establish the following:

(1.1) VR is APX-hard even when ℓ = 3 (actual lower bound is ^). If ℓ ≤ 2,
the problem is polynomially solvable.

(1.2) When £ is unbounded things get worse, for we exhibit an approximation 

preserving (L-) reduction from SLC to VR which implies that any lower 
bound for SLC also holds for VR. In particular, using the hardness of 
approximation result for SLC mentioned in [8,4], this shows that approx- 
imating VR within 0(2^°®^ ™), for any fixed e > 0, is impossible unless 

NP ⊆ QP.

(1.3) We exhibit an approximation preserving (L-) reduction from Set Cover
to VR with unbounded ℓ. This implies two things. First, if NP ≠ P,
a weaker hypothesis than NP ⊄ QP, there exists a constant c > 0
such that VR cannot be approximated within c log n if ℓ is unbounded.
Second, GA is at least as hard to approximate as Independent Set.

We also give several approximation algorithms. All algorithms reduce the prob- 
lem to shortest path computations by using suitable metrics and (randomized) 
preprocessing. We remark that almost all of our algorithms actually work for 
a more general problem than VR defined as follows. The input consists of a 
multigraph G, a source s and a sink t, and a collection of disjoint sets of arcs
$S_1, \ldots, S_k$, called laid-out sets. The goal is to find an st-path which arc-intersects
the smallest possible number of laid-out sets $S_i$. In other words, we have removed
the constraint that the laid-out sets be paths. We refer to this problem as Gen- 
eralized VR or GVR for short. Clearly, all previously described lower-bounds 
apply directly to GVR. 

Let SSSP(n, m) denote the time complexity of single-source single-sink short- 
est paths on an n-node, m-arc directed multigraph with non-negative lengths on 
the arcs. Given an optimization problem, let opt denote the optimal solution 
value for a given instance of the problem. We now summarize our results con- 
cerning upper-bounds; though WVR and PVR are “equivalent” to VR in the 
sense of polynomial-time approximability (see (1.4)), we sometimes include sep- 
arate results for WVR and PVR in case the algorithms are much faster than 
what is given by this equivalence argument. 

(2.1) For VR, we give a linear-time algorithm for the case $\ell \le 2$ and an
$\lceil \ell/2 \rceil$-approximation algorithm for the general case. The same approach yields:
(i) an $\lceil \ell/2 \rceil$-approximation algorithm for PVR running in
$O(\mathrm{SSSP}(n,m))$ time, and (ii) an $\ell$-approximation algorithm for GVR
running in $O(m)$ time.

(2.2) We give an $O(\sqrt{m/opt})$-approximation algorithm for GVR. In view of
the reduction described in point (1.2) above, this will yield an $O(\sqrt{q/opt})$-approximation
for SLC, where q and opt denote the input size and optimal
solution value respectively of an SLC instance. This is the first
non-trivial approximation for SLC, to our knowledge.

(2.3) We show that for any fixed e > 0, there is a 

approximation algorithm for VR. In particular, this shows a “separation”
of VR (and hence SLC) from Chromatic Number, as the same approximation
cannot be achieved for the latter under the hypothesis that
SAT cannot be solved in time 2" * ' . 

(2.4) Improving on (2.2), we give an $O(\sqrt{m/(opt \cdot n^{c/opt})})$-approximation algorithm
for GVR, for any constant $c > 0$; this also yields an
$O(\sqrt{m/\log n})$-approximation. The algorithm, which is randomized, is
based on a technique dubbed “sparsification” which might be useful in
other contexts. The algorithm can be derandomized.

(2.5) We give an $O(\min\{opt, \sqrt{m}\})$-approximation for WVR running in
$O(m \log m)$ time.

Related Work. Recent work of [2], done independently of our work, has con- 
sidered the following red-blue set cover problem. We are given a finite set V, 
and a collection E of subsets of V. V has been partitioned into red and blue 
elements. The objective is to choose a collection $C \subseteq E$ that minimizes the
number of red elements covered, subject to all blue elements being covered.
Here, $C = \{f_1, f_2, \ldots, f_t\}$ covers $v \in V$ iff $v \in f_1 \cup f_2 \cup \cdots \cup f_t$. Letting k denote
the maximum number of blue elements in any $f \in E$, a $2\sqrt{k|E|}$-approximation
algorithm for this problem is shown in [2]. It is also proved in [2] that the red-blue
set cover problem is a special case of GVR, and that there is an approximation-preserving
L-reduction from SLC to the red-blue set cover problem, thus getting
the same hardness result for GVR as we do in (1.2) above. Furthermore, red-blue
set cover is shown to be equivalent to Minimum Monotone Satisfying Assignment 
in [2]. It is not clear if there is a close relationship between red-blue set cover 
and VR; in particular, the results of [2] do not seem to imply our results for VR, 
or our approximation algorithms for GVR. Also, a variant of GVR where we 
have laid-out sets and where we wish to find a spanning tree edge-intersecting 
the minimum number of laid-out sets, has been studied in [9] . 

Preliminaries. The number of vertices and arcs of the input multigraph G will
be denoted by n and m, respectively. Given any VR instance, we may assume
without loss of generality that every node is reachable from s, and t can be
reached from each node, implying $m \ge n$ if we exclude the trivial case in which
G is a path. The maximum length of a laid-out path will be denoted by $\ell$ and
the number of laid-out paths by p. As G is a multigraph, in principle m may
be arbitrarily large with respect to n. In fact, for each pair of vertices u, v, we
can assume that there is at most one arc uv not belonging to any laid-out path,
whereas the other arcs of the form uv will belong to distinct laid-out paths.
Hence, we can assume $m \le n^2 + \ell p \le n(n + p)$, and that the size of a VR (or
WVR) instance is always $O(m)$. G will always be a (simple) graph in our lower
bound proofs. So our lower-bounds apply to graphs, while our upper bounds 
apply to multigraphs as well. 

We will only consider optimization problems for which every feasible solution
has a non-negative objective function value. Throughout the paper, $opt_P$
will denote the optimal solution value to an optimization problem P. When no
confusion arises, we will simply denote it by opt. Given a feasible solution y to an
instance x of P, $val(x, y)$ denotes the value that the objective function of P takes
on the feasible solution y. Consider a minimization (resp., maximization) problem.
Given a parameter $\rho \ge 1$, we say that an algorithm is a $\rho$-approximation
algorithm if the value of the solution it returns is at most $\rho \cdot opt$ (resp., is at
least $opt/\rho$). Similarly, given a parameter k and a function $f(\cdot)$, an
$O(f(k))$-approximation algorithm is an algorithm that returns a solution whose value is
$O(f(k) \cdot opt)$ (resp., $\Omega(opt/f(k))$). We say that such an algorithm approximates the
problem within (a factor) $O(f(k))$.

A problem F is APX-hard if there exists some constant $\sigma > 1$ such that F
cannot be $\sigma$-approximated in polynomial time unless P = NP. Occasionally we
will establish hardness under different complexity assumptions such as
NP $\not\subseteq$ QP, NP $\not\subseteq$ ZPP, etc.

Given two optimization problems A and B, an L-reduction from A to B
consists of a pair of polynomial-time computable functions $(f, g)$ such that, for
two fixed constants $\alpha$ and $\beta$: (a) $f$ maps input instances of A into input instances
of B; (b) given an A-instance a, the corresponding B-instance $f(a)$, and any
feasible solution b for $f(a)$, $g(a,b)$ is a feasible solution for the A-instance a;
(c) $|opt_B(f(a))| \le \alpha\,|opt_A(a)|$ for all a; and
(d) $|opt_A(a) - val(a, g(a,b))| \le \beta\,|opt_B(f(a)) - val(f(a),b)|$ for each a and for every feasible solution b for $f(a)$.
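
To see why an L-reduction transfers approximation guarantees, the following short derivation may help. It is a standard argument, sketched here under the assumption that A and B are both minimization problems and that b is a $\rho$-approximate solution for the B-instance $f(a)$; the symbol $\rho$ is introduced only for this illustration.

```latex
% Assumption: val(f(a), b) <= rho * opt_B(f(a)), both problems are minimization.
\begin{align*}
val(a, g(a,b)) - opt_A(a)
  &\le \beta \,\bigl( val(f(a), b) - opt_B(f(a)) \bigr) && \text{by (d)} \\
  &\le \beta (\rho - 1)\, opt_B(f(a))                   && \text{by the assumption on } b \\
  &\le \alpha \beta (\rho - 1)\, opt_A(a)               && \text{by (c)}.
\end{align*}
% Hence g(a,b) is a (1 + alpha*beta*(rho - 1))-approximate solution for a.
% For alpha = beta = 1 the ratio rho carries over unchanged.
```

In particular, a reduction with both constants equal to 1 lets any lower bound on the approximability of A carry over directly to B.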

2 Hardness of Approximation 

Recall that $\ell$ stands for the maximum length of any laid-out path of the VR-instance.



2.1 Hardness of VR when $\ell$ Is Bounded

In this subsection we show that the problem is APX-hard even for the special case
in which $\ell$ equals 3. We will show in Section 3 that the problem is polynomially
solvable for $\ell = 1, 2$ and that there is a polynomial-time $\lceil \ell/2 \rceil$-approximation
algorithm. Our reduction is from the Max 3-Satisfiability (MAX 3-SAT)
problem. To get a reasonable constant we make use of a major result by Håstad
[6] which shows that the best factor one can hope for to approximate MAX
3-SAT in polynomial time is 8/7 unless NP = P. The proof of the following
theorem is given in the full paper.
Theorem 1. There is an L-reduction from MAX 3-SAT to VR (wherein $\ell \le 3$),
with a = ^ and /3 = 1 . 



Corollary 1. The existence of a polynomial-time — e)- approximation algo- 
rithm for VR, even in the case £ = 3 and for any fixed £ > 0, implies AfV = V . 

2.2 A Stronger Negative Result when $\ell$ Is Unbounded

In this subsection we exhibit an L-reduction from the Symmetric Label Cover
(SLC) problem to VR. The reduction implies that, unless NP $\subseteq$ QP, VR is hard
to approximate within $O(2^{\log^{1-\epsilon} m})$ for any fixed $\epsilon > 0$.

The problem SLC is the following. The input of the problem is a bipartite
graph $H = (L, R, E)$ and two sets of labels: A, to be used on the left hand side L
only, and B, to be used on the right hand side R only. Each edge e has a list $P_e$
of admissible pairs of the form (a, b) where $a \in A$ and $b \in B$. A feasible solution
is a label assignment $A_u \subseteq A$ and $B_v \subseteq B$ for every vertex $u \in L$ and $v \in R$
such that, for every edge e = uv, there is at least one admissible pair $(a, b) \in P_e$
with $a \in A_u$ and $b \in B_v$. The objective is to find a feasible label assignment
which minimizes $\sum_{u \in L} |A_u| + \sum_{v \in R} |B_v|$. Let $q := \sum_{e \in E} |P_e|$, and note that the
size of an SLC instance is equal to $\Theta(q)$.

The existence of a polynomial-time $O(2^{\log^{1-\epsilon} q})$-approximation algorithm for
SLC, for any $\epsilon > 0$, implies that every problem in NP can be solved by a
quasi-polynomial-time algorithm, i.e., NP $\subseteq$ QP [8,4]. We exhibit an L-reduction from
SLC to VR with $\alpha = \beta = 1$ (the proof is given in the full paper).

Theorem 2. There is an L-reduction from SLC to VR with $\alpha = \beta = 1$.

Noting that, in the L-reduction of Theorem 2, the number of arcs of the VR instance
defined is $\Theta(\sum_{e \in E} |P_e|)$, we get

Corollary 2. The existence of a polynomial-time $O(2^{\log^{1-\epsilon} m})$-approximation
algorithm for VR, for any fixed $\epsilon > 0$, implies NP $\subseteq$ QP.

2.3 The Hardness of Approximating Gondolier Avoidance 

Recall that GA, the complementary problem to VR, asks for the maximum 
number of laid-out paths which can be removed without disconnecting t from s. 
We start by showing a very simple L-reduction from Set Cover (SC) to VR 
(the proof is given in the full paper) . 

Theorem 3. There is an L-reduction from SC to VR with $\alpha = \beta = 1$.

The Vertex Cover (VC) problem is the special case of SC where each ele- 
ment of the ground set is contained in exactly two subsets. The problem there- 
fore can be represented by an undirected graph G where vertices correspond to 
subsets and edges denote nonempty intersections between pairs of subsets. If 
we complement the objective function of VC, we have the well-known Independent
Set (IS) problem. Hence, Theorem 3 also yields an L-reduction from IS to
GA with $\alpha = \beta = 1$. Recalling the most up-to-date inapproximability results for
IS [5], along with the structure of the GA instances defined by the reduction, 
yields the following corollary. 

Corollary 3. The existence of a polynomial-time approximation al- 

gorithm for GA, for any e > 0, implies AfV C ZW. The existence of a 
polynomial-time approximation algorithm for GA, for any fixed 

e > 0, implies MV = V . 

3 Approximation Algorithms 

Having shown that even some simple cases of the problem are computationally 
difficult, we turn to presenting approximation algorithms. Our algorithms are
variations of the following idea. We divide the laid-out paths into “long” and
“short”. To find an st-path we take all “long” paths, for there cannot be too many
of these. Then we put a weight of 1 on every arc belonging to each “short” path,
a weight of 0 on all other arcs and run a shortest path algorithm (henceforth 
SPA). This weighting scheme overestimates the cost of using “short” paths and 
hence they will be used parsimoniously. 

In many cases, we will try all the possible values of opt. Note that, for VR,
we know that $opt \le n - 1$, as the solution may be assumed to be a simple
path, and $opt \le p$, as in the worst case all the laid-out paths are intersected by
the optimal solution. (In short, we can assume w.l.o.g. that the value of opt is
known.) The term $\mu := \min\{n - 1, p\}$ will often appear in the time complexity
of the algorithms we present. We will use a weight function defined as follows.
Given a nonnegative integer x, for each arc a of the graph, let

$$z_x(a) := \begin{cases} 1 & \text{if } a \text{ belongs to some laid-out path of length } \le x, \\ 0 & \text{otherwise.} \end{cases}$$

It is not hard to see:

Lemma 1. The number of paths of length $\le x$ intersected by the SPA solution
with weights $z_x$ is at most $opt \cdot x$.
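
The weighting scheme and the shortest path step can be sketched in Python as follows. This is only an illustration under assumptions that are not in the paper: arcs are encoded as hypothetical triples `(tail, head, pid)`, where `pid` identifies the laid-out path containing the arc (or is `None`), and SPA is realized by a 0-1 breadth-first search, which suffices because all weights lie in {0, 1}.

```python
from collections import Counter, deque

# Hypothetical encoding: arc = (tail, head, pid); pid is None for arcs
# that belong to no laid-out path.

def z_weights(arcs, x):
    """Return z_x(a) for every arc: 1 iff a lies on a laid-out path of length <= x."""
    length = Counter(pid for _, _, pid in arcs if pid is not None)
    return [1 if pid is not None and length[pid] <= x else 0
            for _, _, pid in arcs]

def spa_01(n, arcs, weights, s, t):
    """0-1 BFS: shortest s-t path under 0/1 weights, as a list of arc indices."""
    adj = [[] for _ in range(n)]
    for idx, (u, _, _) in enumerate(arcs):
        adj[u].append(idx)
    dist = [float("inf")] * n
    pred = [None] * n
    dist[s] = 0
    dq = deque([s])
    while dq:
        u = dq.popleft()
        for idx in adj[u]:
            v, w = arcs[idx][1], weights[idx]
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = idx
                # Weight-0 arcs go to the front so vertices leave in distance order.
                (dq.appendleft if w == 0 else dq.append)(v)
    if dist[t] == float("inf"):
        return None
    route, v = [], t
    while v != s:
        route.append(pred[v])
        v = arcs[pred[v]][0]
    return route[::-1]

def paths_hit(arcs, route):
    """Set of laid-out paths arc-intersected by the route."""
    return {arcs[i][2] for i in route} - {None}
```

For example, with `arcs = [(0, 1, 0), (1, 3, 0), (0, 2, 1), (2, 3, None)]`, the call `z_weights(arcs, 1)` charges only the single-arc laid-out path 1, while `z_weights(arcs, 2)` charges both laid-out paths.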



3.1 Constant Factor Approximation for Fixed $\ell$

Recall that the maximum length of a laid-out path is denoted by $\ell$. We now
present an $\lceil \ell/2 \rceil$-approximation, which is especially useful in the common case
where $\ell$ is “small” (in practice, lightpaths will typically only consist of a few
arcs).

The first algorithm we present, called $A_0$, initially assigns weight 1 to each
arc that lies in some laid-out path, and weight 0 to all other arcs. Then, for each
laid-out path of length at least 2 and each pair of consecutive arcs (u,v), (v,t) in the path,
$A_0$ adds a new arc (u,t) with weight 1. Finally, $A_0$ applies SPA to the graph
obtained. At the end, any new arcs used are replaced by the corresponding pairs
of “real” consecutive arcs. Also, since all arc-weights lie in {0, 1}, SPA only takes
$O(m)$ time here.

The following fact follows from Lemma 1, since the new “shortcut” arcs (such
as (u,t)) that we add above essentially “reduce” $\ell$ to $\lceil \ell/2 \rceil$:

Fact 1. Algorithm $A_0$ returns a solution of value at most $opt \cdot \lceil \ell/2 \rceil$ in $O(m)$
time.

The following corollaries are immediate consequences of the above fact. 

Corollary 4. Algorithm $A_0$ is an $O(m)$-time $\lceil \ell/2 \rceil$-approximation algorithm for
VR.

Corollary 5. Algorithm $A_0$ is an $O(m)$-time exact algorithm for VR when $\ell \le 2$.



It is also easy to see that the above approach yields an $O(\mathrm{SSSP}(n,m))$-time
$\lceil \ell/2 \rceil$-approximation algorithm for PVR. For GVR, we use the same algorithm,
except that we do not add any “shortcut arcs” such as (u, t) above. Lemma 1
shows that this is an $O(m)$-time $\ell$-approximation algorithm for GVR.
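
As an illustration of the shortcut construction, here is a minimal Python sketch of the algorithm for VR. The input encoding is an assumption made for this example only: `free_arcs` lists the arcs lying in no laid-out path, and `laid_out` lists each laid-out path as an ordered list of arcs. Each working arc remembers the real arcs it stands for, so shortcuts can be expanded at the end.

```python
from collections import deque

def a0(n, free_arcs, laid_out, s, t):
    """Sketch of the shortcut algorithm; returns (route, value) or None.

    route is a list of real arcs (u, v, pid); value is the number of
    distinct laid-out paths the route arc-intersects.
    """
    # Working arc: (tail, head, weight, list of real arcs it represents).
    work = [(u, v, 0, [(u, v, None)]) for u, v in free_arcs]
    for pid, path in enumerate(laid_out):
        for u, v in path:
            work.append((u, v, 1, [(u, v, pid)]))
        # One shortcut per pair of consecutive arcs (u, v), (v, w).
        for (u, v), (_, w) in zip(path, path[1:]):
            work.append((u, w, 1, [(u, v, pid), (v, w, pid)]))

    adj = [[] for _ in range(n)]
    for idx, (u, _, _, _) in enumerate(work):
        adj[u].append(idx)

    # 0-1 BFS, valid because all weights lie in {0, 1}.
    dist = [float("inf")] * n
    pred = [None] * n
    dist[s] = 0
    dq = deque([s])
    while dq:
        u = dq.popleft()
        for idx in adj[u]:
            _, v, w, _ = work[idx]
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = idx
                (dq.appendleft if w == 0 else dq.append)(v)
    if dist[t] == float("inf"):
        return None

    # Walk back from t, replacing shortcuts by their real arcs.
    route, v = [], t
    while v != s:
        tail, _, _, real = work[pred[v]]
        route = real + route
        v = tail
    value = len({pid for _, _, pid in route if pid is not None})
    return route, value
```

After shortcuts are expanded the route may revisit a vertex, so it is best read as an st-walk; removing cycles can only reduce the set of laid-out paths it touches.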

3.2 An Approximation Algorithm for Unbounded $\ell$

We next show an approximation algorithm for unbounded $\ell$, called $A_1$. The
approximation guarantee of this algorithm is $O(\sqrt{m/opt})$; we then show how
this factor of opt in the denominator of the approximation bound can be used
to show that VR (and hence SLC) is different from the Chromatic Number
problem, in a certain quantitative sense related to approximability.

Let $x := \sqrt{m/opt}$. As opt is unknown, algorithm $A_1$ tries all the $O(\mu)$ possible
values of opt, considering the associated x value. Given the graph G and any
nonnegative integer x, let

$$f(G, x) := (\text{\# of laid-out paths of length} > x).$$

For each value of x, $A_1$ assigns weight $z_x(a)$ to each arc a of the graph, and applies
SPA to the corresponding instance. On output, $A_1$ gives the best solution found
by the SPA calls for the different values of x.

Lemma 2. For any nonnegative integer x: (a) $f(G,x) \le m/(x+1)$; (b) the
solution obtained by applying SPA after having assigned weight $z_x(a)$ to each arc
a of the graph has value at most $opt \cdot x + f(G, x)$.

Proof. Since all laid-out paths are arc-disjoint, the number of paths of length
more than x is at most $m/(x+1)$, yielding (a). Moreover, for each x, we know by
Lemma 1 that the number of paths of length $\le x$ intersected by the SPA solution
with weights $z_x$ is at most $opt \cdot x$. So the total number of paths intersected is at
most $opt \cdot x + f(G, x)$, showing (b).
Theorem 4. (a) Algorithm $A_1$ runs in $O(\mu m)$ time, and returns a solution of
value $O(\sqrt{opt \cdot m})$. (b) A solution of value $O(\sqrt{opt \cdot m})$, with a slightly worse
constant than in the bound of (a), can also be computed in time $O(m \log \mu)$.

Proof. By Lemma 2 (a) and (b), the total number of paths intersected is at
most $m/(x + 1) + opt \cdot x$. This quantity is minimized when x is approximately
$\sqrt{m/opt}$, and the minimum value is $O(\sqrt{opt \cdot m})$. Since we run SPA for each
possible value of opt, the running time follows, for part (a). Part (b) follows by
guessing opt = 1, 2, 4, ....

Corollary 6. Algorithm $A_1$ is an $O(\sqrt{m/opt})$-approximation algorithm for VR.

The above approach yields an $O(\sqrt{m/opt})$-approximation algorithm for GVR
as well.
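
The doubling variant of part (b) can be sketched as follows. This is an illustrative Python sketch, not the authors' code: arcs are assumed to be hypothetical triples `(tail, head, pid)` with `pid = None` for arcs in no laid-out path, x is rounded down to an integer, and the helper `spa_01` is a plain 0-1 breadth-first search.

```python
import math
from collections import Counter, deque

def spa_01(n, arcs, weights, s, t):
    """0-1 BFS returning a shortest s-t path as a list of arc indices, or None."""
    adj = [[] for _ in range(n)]
    for idx, (u, _, _) in enumerate(arcs):
        adj[u].append(idx)
    dist, pred = [float("inf")] * n, [None] * n
    dist[s] = 0
    dq = deque([s])
    while dq:
        u = dq.popleft()
        for idx in adj[u]:
            v, w = arcs[idx][1], weights[idx]
            if dist[u] + w < dist[v]:
                dist[v], pred[v] = dist[u] + w, idx
                (dq.appendleft if w == 0 else dq.append)(v)
    if dist[t] == float("inf"):
        return None
    route, v = [], t
    while v != s:
        route.append(pred[v])
        v = arcs[pred[v]][0]
    return route[::-1]

def a1_doubling(n, arcs, s, t):
    """Try opt = 1, 2, 4, ... and keep the route hitting the fewest laid-out paths."""
    length = Counter(pid for _, _, pid in arcs if pid is not None)
    m = len(arcs)
    mu = min(n - 1, len(length))          # opt never exceeds mu
    best, guess = None, 1
    while guess <= max(1, mu):
        x = math.isqrt(m // guess)        # x is roughly sqrt(m / opt)
        weights = [1 if pid is not None and length[pid] <= x else 0
                   for _, _, pid in arcs]
        route = spa_01(n, arcs, weights, s, t)
        if route is None:                 # t is unreachable from s
            return None
        value = len({arcs[i][2] for i in route} - {None})
        if best is None or value < best[1]:
            best = (route, value)
        guess *= 2
    return best
```

Paths longer than x receive weight 0, which corresponds to "taking all long paths" at no charge; the returned value counts every laid-out path actually intersected, long or short.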

Notation (q-Exhaustive Search). For a given VR instance, suppose we know
that opt is at most some value q. Then, the following simple algorithm to find an
optimal solution, called q-exhaustive search, will be useful in a few contexts from
now on: enumerate all subsets X of the set of laid-out paths such that $|X| \le q$,
and check if the laid-out paths in X alone are sufficient to connect s to t. The
time complexity of this approach is $O\bigl(m \cdot \sum_{i=0}^{q} \binom{p}{i}\bigr)$, since each check takes $O(m)$ time.

In particular, if q is a constant, then we have a polynomial-time algorithm.
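
A direct rendering of this enumeration in Python might look as follows. It is a sketch under the assumption that arcs are given as hypothetical triples `(tail, head, pid)`, with `pid = None` for arcs outside every laid-out path; such arcs are always available, and subsets are tried in order of increasing size so that the first hit is optimal.

```python
from collections import deque
from itertools import combinations

def connects(n, arcs, allowed, s, t):
    """BFS reachability using free arcs plus arcs of the allowed laid-out paths."""
    adj = [[] for _ in range(n)]
    for u, v, pid in arcs:
        if pid is None or pid in allowed:
            adj[u].append(v)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return t in seen

def q_exhaustive_search(n, arcs, s, t, q):
    """Smallest set of at most q laid-out paths that connects s to t, or None."""
    pids = sorted({pid for _, _, pid in arcs if pid is not None})
    for size in range(min(q, len(pids)) + 1):
        for subset in combinations(pids, size):
            if connects(n, arcs, set(subset), s, t):
                return set(subset)
    return None
```

Each reachability check costs time linear in the number of arcs, so the total work is dominated by the number of subsets enumerated.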

A Separation Result. We now use Theorem 4 to show a “separation” result 
that indicates that VR is likely to be in an approximation class different from 
that to which Chromatic Number (CN) belongs. 

Let N denote the size of an instance of SAT, and s denote the number of
vertices in a CN instance. Note that the size S of a CN instance is at most
$O(s^2)$. Given any fixed $\epsilon > 0$ and any oracle that approximates CN to within
$s^{1-\epsilon}$, the work of [7] presents a randomized polynomial-time algorithm that uses
this oracle to correctly solve SAT instances with a probability of at least 2/3.
Thus, if a positive constant $\delta$ is such that any randomized algorithm for SAT
must have running time $\Omega(2^{N^{\delta}})$, then there is a constant $\delta' > 0$ such that for
any constant $\epsilon > 0$, there is no $O(2^{S^{\delta'}})$-time $O(S^{1/2-\epsilon})$-approximation algorithm
for CN. We now show that the situation is different for VR.

Theorem 5. For any fixed £ > 0, there is a 2^^"^" -time mf~^)-appro- 
ximation algorithm for VR. 

Proof. As before, we assume we know opt, in the sense that we are allowed
to try the procedure below for all the $O(\mu)$ possible values of opt. Suppose a
positive constant $\epsilon$ is given. Theorem 4 shows that in polynomial time, we can
find a VR solution of value $O(\sqrt{opt \cdot m}) = O(opt \cdot \sqrt{m/opt})$. Thus, if opt > ,
we can approximate the problem to within a factor $O(\sqrt{m/opt})$ <
in polynomial time. Accordingly, we assume from now on that opt < m®. In 
this case, we can do an m^-exhaustive search to find an optimal solution, with 
running time 0(m • (T))> which is m • 



3.3 Randomized Sparsification: Improvements when opt Is Small 

We next show how to improve on the $O(\sqrt{opt \cdot m})$ solution value for VR in
case opt is “small”, i.e., $O(\log n)$. In particular, for any constant $C > 0$, we
present a randomized polynomial-time algorithm $A_2$ that with high probability
computes a solution of value at most $O(\sqrt{opt \cdot m / n^{C/opt}})$. For instance, if $opt = \sqrt{\log n}$,
$A_2$ computes a solution of value $O(\sqrt{opt \cdot m} / 2^{\Omega(\sqrt{\log n})})$. The difference between $A_1$ and
$A_2$ is a randomized “sparsification” technique. Note that, due to Corollary 6,
the following discussion is interesting only if $opt \le \log n$.

Suppose that for a given constant $C > 0$, we aim to find a solution of value
$O(\sqrt{opt \cdot m / n^{C/opt}})$. We will assume throughout that n is sufficiently large as a function
of C, i.e., $n \ge f(C)$ for some appropriate function $f(\cdot)$. Indeed, if $n < f(C)$,
then $opt \le \log n$ is bounded by a constant, and we can find an optimal solution
by a polynomial-time $(\log n)$-exhaustive search.

As in algorithm $A_1$, algorithm $A_2$ has to try all the possible $O(\mu)$ values of
opt, computing a solution for each, and returning on output the best one found.
In the following, we will consider the iteration with the correct opt value. Let
$\lambda := n^{C/opt}$ and $x := \lfloor \sqrt{m/(opt \cdot \lambda)} \rfloor$, where C is the given constant.

If $opt \le 2C$, algorithm $A_2$ finds an optimal solution by a (2C)-exhaustive
search. Hence, in the following we describe the behavior of $A_2$ for $opt > 2C$. By
the definition of x and the fact that $m \ge n$, we have

$$\frac{m}{\lambda x} \ge \sqrt{\frac{opt \cdot m}{\lambda}} \ge n^{1/4}. \qquad (1)$$

The deletion of a laid-out path corresponds to the removal of all the associated
arcs from G. For each possible value of opt, algorithm $A_2$ uses the following
“sparsification” procedure: Each laid-out path is independently deleted with
probability $(1 - 1/\lambda)$. Then, SPA is applied on the resulting graph, after having
assigned weight $z_x(a)$ to each arc a that was not removed. The sparsification
procedure and the associated shortest path computation are repeated $O(n^C)$
times, and the best solution found is output.
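
The sparsification loop for one guessed value of opt can be sketched as follows. This is a hedged illustration: the arc encoding `(tail, head, pid)`, the function names, the explicit `rounds` parameter (standing in for the polynomial number of repetitions), and the seeded generator are all assumptions made for the example.

```python
import math
import random
from collections import Counter, deque

def spa_01(n, arcs, weights, s, t):
    """0-1 BFS returning a shortest s-t path as a list of arc indices, or None."""
    adj = [[] for _ in range(n)]
    for idx, (u, _, _) in enumerate(arcs):
        adj[u].append(idx)
    dist, pred = [float("inf")] * n, [None] * n
    dist[s] = 0
    dq = deque([s])
    while dq:
        u = dq.popleft()
        for idx in adj[u]:
            v, w = arcs[idx][1], weights[idx]
            if dist[u] + w < dist[v]:
                dist[v], pred[v] = dist[u] + w, idx
                (dq.appendleft if w == 0 else dq.append)(v)
    if dist[t] == float("inf"):
        return None
    route, v = [], t
    while v != s:
        route.append(pred[v])
        v = arcs[pred[v]][0]
    return route[::-1]

def sparsify_once(n, arcs, s, t, x, lam, rng):
    """Keep each laid-out path with probability 1/lam, then run SPA with z_x."""
    length = Counter(pid for _, _, pid in arcs if pid is not None)
    kept = {pid for pid in sorted(length) if rng.random() < 1.0 / lam}
    sub = [a for a in arcs if a[2] is None or a[2] in kept]
    weights = [1 if pid is not None and length[pid] <= x else 0
               for _, _, pid in sub]
    route = spa_01(n, sub, weights, s, t)
    return None if route is None else [sub[i] for i in route]

def a2_for_guess(n, arcs, s, t, opt_guess, C, rounds, seed=0):
    """Repeat sparsification for one guess of opt; return (route, value) or None."""
    rng = random.Random(seed)
    lam = n ** (C / opt_guess)
    x = math.isqrt(int(len(arcs) / (opt_guess * lam)))
    best = None
    for _ in range(rounds):
        route = sparsify_once(n, arcs, s, t, x, lam, rng)
        if route is None:                 # deletion disconnected t from s
            continue
        value = len({pid for _, _, pid in route if pid is not None})
        if best is None or value < best[1]:
            best = (route, value)
    return best
```

A round that disconnects t from s is simply skipped; the analysis only needs one round in which all paths used by an optimal solution survive while few long paths remain.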

Theorem 6. Given any constant $C > 0$, Algorithm $A_2$ runs in polynomial time
and returns a VR solution of value $O(\sqrt{opt \cdot m / n^{C/opt}})$ with high probability.

Proof. Let P be an optimal VR solution. We have that

$$\Pr[\text{all laid-out paths intersected by } P \text{ survive}] = (1/\lambda)^{opt} = n^{-C}. \qquad (2)$$
Moreover, let Z be the number of laid-out paths of length more than x that
remain undeleted. By Lemma 2 (a) and linearity of expectation, $E[Z] \le m/(\lambda x)$.
Using a simple Chernoff bound [13], we get

$$\Pr\left[ Z \ge \frac{2m}{\lambda x} \right] \le (e/4)^{m/(\lambda x)} \le (e/4)^{n^{1/4}}, \qquad (3)$$

by (1), where e denotes the base of the natural logarithm as usual.

As mentioned earlier in this subsection, we may assume that n is large
enough: we will assume that $n^{-C} - (e/4)^{n^{1/4}} \ge 1/(2n^C)$. Thus, with probability
at least $n^{-C} - (e/4)^{n^{1/4}} \ge 1/(2n^C)$, we have that (i) all laid-out paths
intersected by P survive, and (ii) $Z \le 2m/(\lambda x)$. Thus, sparsification ensures that
$f(G, x)$ is at most $2m/(\lambda x)$ after path deletion. By Lemma 2, running SPA yields
a solution of value at most $opt \cdot x + 2m/(\lambda x) = O(\sqrt{opt \cdot m/\lambda})$. The above success
probability of $1/(2n^C)$ can be boosted to, say, 0.9 by repeating the procedure $O(n^C)$
times. The time complexity follows immediately from the algorithm description.



Corollary 7. Algorithm $A_2$ (with C = 1, say) is a randomized $O(\sqrt{m/\log n})$-approximation
algorithm for VR. (In other words, Algorithm $A_2$ always runs in
polynomial time, and delivers a solution that is at most $O(\sqrt{m/\log n})$ times the
optimal solution with high probability.)

Proof. By Theorem 6, $A_2$ with C = 1 computes a solution of value
$opt \cdot O(\sqrt{m/(opt \cdot n^{1/opt})})$. Now, elementary calculus shows that $opt \cdot n^{1/opt}$ is minimized
when $opt = \Theta(\log n)$. Thus, $opt \cdot O(\sqrt{m/(opt \cdot n^{1/opt})})$ is at most
$opt \cdot O(\sqrt{m/\log n})$.



Also, Algorithm $A_2$ can be derandomized, as shown by the following theorem,
whose proof is deferred to the full paper.

Theorem 7. For any given constant $C > 0$, there is a deterministic polynomial-time
approximation algorithm for VR that returns a solution of value
$O(\sqrt{opt \cdot m / n^{C/opt}})$.


The same approach can be applied to GVR, by deleting the laid-out sets instead 
of the laid-out paths. 



3.4 Simple Algorithms for the Weighted Case 

Recall that in WVR we have a weight for each laid-out path, which equals
its length. We now show a simple $O(\sqrt{m})$-approximation for WVR. A simple
algorithm $A_3$ returns a solution of value at most $opt^2$ for WVR. This algorithm
guesses the value of opt as 1, 2, 4, ...; note that $opt \le m$. If $opt \ge \sqrt{m}$, we simply
include all laid-out paths, leading to an $O(\sqrt{m})$-approximation.

Suppose $opt < \sqrt{m}$. We can restrict attention to laid-out paths of length at
most opt. Accordingly, $A_3$: (i) removes all the laid-out paths of length > opt; (ii)
assigns weight 1 to all remaining arcs that lie in laid-out paths, and weight 0 to
arcs that do not lie in any laid-out path; and (iii) applies SPA. Clearly, the path
that we get has cost at most $opt^2$.

Theorem 8. Algorithm $A_3$ is an $O(m \log m)$-time $O(\min\{opt, \sqrt{m}\})$-approximation
algorithm for WVR.
Abstract. We study the problem of finding highly connected subgraphs
of undirected and directed graphs. For undirected graphs, the notion of 
density of a subgraph we use is the average degree of the subgraph. 
For directed graphs, a corresponding notion of density was introduced 
recently by Kannan and Vinay. This is designed to quantify highly con- 
nectedness of substructures in a sparse directed graph such as the web 
graph. We study the optimization problems of finding subgraphs maxi- 
mizing these notions of density for undirected and directed graphs. This 
paper gives simple greedy approximation algorithms for these optimiza- 
tion problems. We also answer an open question about the complexity 
of the optimization problem for directed graphs. 



1 Introduction 

The problem of finding dense components in a graph has been extensively stud- 
ied [1,2, 4, 5, 9]. Researchers have explored different definitions of density and ex- 
amined the optimization problems corresponding to finding substructures that 
maximize a given notion of density. The complexity of such optimization prob- 
lems varies widely with the specific choice of a definition. In this paper, the 
notion of density we will be interested in is, loosely speaking, the average de- 
gree of a subgraph. Precise definitions for both undirected and directed graphs 
appear in Section 1.1. 

Recently, the problem of finding relatively highly connected sub-structures in 
the web graph has received a lot of attention [8,10,11,12]. Experiments suggest 
that such substructures correspond to communities on the web, i.e. collections of 
pages related to the same topic. Further, the presence of a large density of links 
within a particular set of pages is considered an indication of the importance of 
these pages. The algorithm of Kleinberg [10] identifies hubs (resource lists) and 
authorities (authoritative pages) amongst the set of potential pages relevant to 
a query. The hubs are characterized by the presence of a large number of links to 
the authorities and the authorities are characterized by the presence of a large 
number of links from the hubs. 

* Research supported by the Pierre and Christine Lamond Fellowship, an ARO MURI 
while the author was visiting IBM Almaden Research Center.
Kannan and Vinay [9] introduce a notion of density for directed graphs that
quantifies how relatively highly connected a substructure is and that is suitable for sparse directed graphs
such as the web graph. This is motivated by trying to formalize the notion of 
finding sets of hubs and authorities that are highly connected relative to the 
rest of the graph. In this paper, we study the optimization problem of finding 
a subgraph of maximum density according to this notion. We now proceed to 
formally define the notions of density that we will use in this paper. These are 
identical to the definitions in [9] . 



1.1 Definitions and Notation 

Let $G(V, E)$ be an undirected graph and $S \subseteq V$. We define $E(S)$ to be the edges
induced by S, i.e.

$$E(S) = \{ij \in E : i \in S, j \in S\}.$$



Definition 1. Let $S \subseteq V$. We define the density $f(S)$ of the subset S to be

$$f(S) = \frac{|E(S)|}{|S|}.$$

We define the density $f(G)$ of the undirected graph $G(V, E)$ to be

$$f(G) = \max_{S \subseteq V} \{f(S)\}.$$



Note that $2f(S)$ is simply the average degree of the subgraph induced by
S and $2f(G)$ is the maximum average degree over all induced subgraphs. The
problem of computing $f(G)$ is also known as the Densest Subgraph problem and
can be solved using flow techniques. (See Chapter 4 in Lawler’s book [13]. The
algorithm, due to Gallo, Grigoriadis and Tarjan [7], uses parametric maximum
flow, which can be done in the time required to do a single maximum flow
computation using the push-relabel algorithm.)
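
To make the definition concrete, the following Python sketch computes f(S) and, for very small graphs only, f(G) by brute force over all non-empty vertex subsets. The function names are hypothetical and the exponential enumeration is purely illustrative; it is not the flow-based method mentioned above.

```python
from fractions import Fraction
from itertools import combinations

def density(edges, S):
    """f(S) = |E(S)| / |S| for a non-empty vertex set S (edges as pairs)."""
    S = set(S)
    induced = sum(1 for i, j in edges if i in S and j in S)
    return Fraction(induced, len(S))

def densest_subgraph_bruteforce(vertices, edges):
    """Return (S, f(S)) maximizing the density; exponential time, tiny inputs only."""
    best = None
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            d = density(edges, S)
            if best is None or d > best[1]:
                best = (set(S), d)
    return best
```

For a complete graph on four vertices with one extra pendant vertex, the whole graph has density 7/5 while the four-clique alone has density 3/2, so the maximum is attained on a proper subset.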

A related problem that has been extensively studied is the Densest k-Subgraph
Problem, where the goal is to find an induced subgraph of k vertices of maximum 
average degree [1,2, 4, 5]. Relatively little is known about the approximability of 
this problem and resolving this remains a very interesting open question. 

We now define density for directed graphs. Let $G(V, E)$ be a directed graph
and $S, T \subseteq V$. We define $E(S,T)$ to be the set of edges going from S to T, i.e.

$$E(S,T) = \{ij \in E : i \in S, j \in T\}.$$

Definition 2. Let $S, T \subseteq V$. We define the density $d(S,T)$ of the pair of sets
S, T to be

$$d(S,T) = \frac{|E(S,T)|}{\sqrt{|S|\,|T|}}.$$

We define the density $d(G)$ of the directed graph $G(V, E)$ to be

$$d(G) = \max_{S,T \subseteq V} \{d(S,T)\}.$$
The above notion of density for directed graphs was introduced by Kannan
and Vinay [9]. The set S corresponds to the hubs and the set T corresponds
to the authorities in [10]. Note that for $|S| = |T|$, $d(S,T)$ is simply the average
number of edges going from a vertex in S to T (or the average number of edges
going into a vertex in T from S). Kannan and Vinay explain why this definition
of density makes sense in the context of sparse directed graphs such as the web
graph. Note that in the above definition, the sets S and T are not required to
be disjoint.

The problem of computing $d(G)$ was considered in [9]. They obtain an $O(\log n)$
approximation by relating $d(G)$ to the singular value of the adjacency matrix
of G and using the recently developed Monte Carlo algorithm for the Singular
Value Decomposition of a matrix [3,6]. They also show how the SVD techniques
can be used to get an $O(\log n)$ approximation for $f(G)$. They leave open the
question of resolving the complexity of computing $d(G)$ exactly.

In this paper, we prove that the quantity $d(G)$ can be computed exactly using
linear programming techniques. We also give a simple greedy 2-approximation
algorithm for this problem. As a warmup, we first explain how $f(G)$ can be computed
exactly using linear programming techniques. We then present a simple
greedy 2-approximation algorithm for this problem. This proceeds by repeatedly
deleting the lowest degree vertex. Our algorithm and analysis for computing the
density $d(G)$ for directed graphs builds on the techniques for computing $f(G)$
for undirected graphs.



2 Exact Algorithm for $f(G)$

We show that the problem of computing $f(G)$ can be expressed as a linear program.
We will show that the optimal solution to this LP is a convex combination
of integral solutions. This result by itself is probably not that interesting given
the flow-based exact algorithm for computing $f(G)$ [7]. However, the proof technique
will lay the foundation for the more complicated proofs in the algorithm
for computing $d(G)$ later.

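We use the following LP:

$$\begin{aligned}
\max \quad & \sum_{ij} x_{ij} && (1) \\
\forall ij \in E \quad & x_{ij} \le y_i && (2) \\
\forall ij \in E \quad & x_{ij} \le y_j && (3) \\
& \sum_i y_i \le 1 && (4) \\
& x_{ij},\, y_i \ge 0 && (5)
\end{aligned}$$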



Lemma 1. For any $S \subseteq V$, the value of the LP (1)-(5) is at least $f(S)$.

Proof. We will give a feasible solution for the LP with value $f(S)$. Let $x = 1/|S|$.
For each $i \in S$, set $y_i = x$. For each $ij \in E(S)$, set $x_{ij} = x$. All the remaining
variables are set to 0. Now, $\sum_i y_i = |S| \cdot x = 1$. Thus, $(x, y)$ is a feasible solution
to the LP. The value of this solution is

$$|E(S)| \cdot x = \frac{|E(S)|}{|S|} = f(S).$$

This proves the lemma.



Lemma 2. Given a feasible solution of the LP (1)-(5) with value v we can
construct $S \subseteq V$ such that $f(S) \ge v$.

Proof. Consider a feasible solution $(x, y)$ to the LP (1)-(5). Without loss of
generality, we can assume that for all ij, $x_{ij} = \min(y_i, y_j)$.

We define a collection of sets S indexed by a parameter $r \ge 0$. Let $S(r) = \{i : y_i \ge r\}$
and $E(r) = \{ij : x_{ij} \ge r\}$. Since $x_{ij} \le y_i$ and $x_{ij} \le y_j$, $ij \in E(r) \Rightarrow i \in S(r), j \in S(r)$.
Also, since $x_{ij} = \min(y_i, y_j)$, $i \in S(r), j \in S(r) \Rightarrow ij \in E(r)$.
Thus $E(r)$ is precisely the set of edges induced by $S(r)$.

Now, $\int_0^\infty |S(r)|\,dr = \sum_i y_i \le 1$. Note that $\int_0^\infty |E(r)|\,dr = \sum_{ij} x_{ij}$. This is the
objective function value of the LP solution. Let this value be v.

We claim that there exists r such that $|E(r)|/|S(r)| \ge v$. Suppose there were
no such r. Then

$$\int_0^\infty |E(r)|\,dr < v \int_0^\infty |S(r)|\,dr \le v.$$

This gives a contradiction. To find such an r, notice that we can check all combinatorially
distinct sets $S(r)$ by simply checking the sets $S(r)$ obtained by setting
$r = y_i$ for every $i \in V$.
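
The threshold argument in the proof translates into a short rounding routine. The sketch below is an illustration with hypothetical function names: it takes only the y-values of a feasible LP solution (as a dictionary from vertices to exact fractions), treats every edge variable as the minimum of its two endpoint values, and tries one threshold per distinct positive y-value.

```python
from fractions import Fraction

def lp_value(edges, y):
    """Objective value when every x_ij is raised to min(y_i, y_j)."""
    return sum(min(y[i], y[j]) for i, j in edges)

def round_lp_solution(edges, y):
    """Return (S, f(S)) for the best threshold set S(r) = {i : y_i >= r}."""
    best = None
    for r in sorted({val for val in y.values() if val > 0}):
        S = {i for i, val in y.items() if val >= r}
        induced = sum(1 for i, j in edges if i in S and j in S)
        d = Fraction(induced, len(S))
        if best is None or d > best[1]:
            best = (S, d)
    return best
```

By the lemma, whenever the y-values sum to at most 1 the returned density is at least the value reported by `lp_value`; exact fractions are used here so that this comparison is not blurred by floating-point rounding.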

Putting Lemmas 1 and 2 together, we get the following theorem.

Theorem 1.

$$\max_{S \subseteq V} \{f(S)\} = OPT(LP) \qquad (6)$$

where $OPT(LP)$ denotes the value of the optimal solution to the LP (1)-(5).
Further, a set S maximizing $f(S)$ can be computed from the optimal solution to
the LP.

Proof. First we establish the equality (6). From Lemma 1, the RHS $\ge$ the LHS.
(Consider the S that maximizes $f(S)$.) From Lemma 2, the LHS $\ge$ the RHS.
The proof of Lemma 2 gives a construction of a set S that maximizes $f(S)$ from
the optimal LP solution.

3 Greedy 2-Approximation for $f(G)$

We want to produce a subgraph of G of large average degree. Intuitively, we 
should throw away low degree vertices in order to produce such a subgraph. 
This suggests a fairly natural greedy algorithm. In fact, the performance of such 
an algorithm has been analyzed by Asahiro, Iwama, Tamaki and Tokuyama [2]
for a slightly different problem, that of obtaining a large average degree subgraph 
on a given number k of vertices. 

The algorithm maintains a subset $S$ of vertices. Initially $S \leftarrow V$.
In each iteration, the algorithm identifies $i_{\min}$, the vertex of minimum
degree in the subgraph induced by $S$. The algorithm removes $i_{\min}$ from the
set $S$ and moves on to the next iteration. The algorithm stops when the set $S$
is empty. Of all the sets $S$ constructed during the execution of the algorithm,
the set $S$ maximizing $f(S)$ (i.e., the set of maximum average degree) is
returned as the output of the algorithm.
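
The greedy procedure just described can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `greedy_densest_subgraph`, the vertex numbering 0..n-1, and the edge-list input are assumptions made for the sketch. It uses the simple minimum search, not the heap-based variant.

```python
from fractions import Fraction


def greedy_densest_subgraph(n, edges):
    """Greedy peeling on an undirected simple graph with vertices 0..n-1.

    Returns (best_set, best_density) where density is f(S) = |E(S)| / |S|.
    Simple O(n^2 + m) implementation: a linear scan finds the minimum degree.
    """
    if n < 1:
        raise ValueError("graph must have at least one vertex")
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        if i != j:                      # ignore self-loops
            adj[i].add(j)
            adj[j].add(i)
    alive = set(range(n))
    degree = {v: len(adj[v]) for v in alive}
    num_edges = sum(degree.values()) // 2
    best_set, best_density = set(alive), Fraction(num_edges, len(alive))
    while len(alive) > 1:
        i_min = min(alive, key=lambda v: degree[v])   # minimum degree in G[S]
        alive.remove(i_min)
        num_edges -= degree[i_min]      # its edges into S disappear
        for u in adj[i_min]:
            if u in alive:
                degree[u] -= 1
        density = Fraction(num_edges, len(alive))
        if density > best_density:
            best_set, best_density = set(alive), density
    return best_set, best_density
```

On a 4-clique with a two-vertex tail attached, the sketch peels the tail first and returns the clique with density 3/2.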

We will prove that the algorithm produces a 2-approximation for $f(G)$. There
are various ways of proving this. We present a proof which may seem complicated
at first. This will set the stage for the algorithm for $d(G)$ later. Moreover, we
believe the proof is interesting because it makes connections between the greedy 
algorithm and the dual of the LP formulation we used in the previous section. 

In order to analyze the algorithm, we produce an upper bound on the optimal
solution. The upper bound has the following form: We assign each edge $ij$ to
either $i$ or $j$. For a vertex $i$, $d(i)$ is the number of edges $ij$ or $ji$
assigned to $i$. Let $d^{\max} = \max_i \{d(i)\}$. (Another way to view this is
that we will orient the edges of the graph and $d^{\max}$ is the maximum number
of edges oriented towards any vertex.) The following lemma shows that $f(S)$ is
bounded by $d^{\max}$.

Lemma 3.

$$\max_{S} \{f(S)\} \le d^{\max}$$

Proof. Consider the set $S$ that maximizes $f(S)$. Now, each edge in $E(S)$ must
be assigned to a vertex in $S$. Thus

$$|E(S)| \le |S| \cdot d^{\max}, \qquad f(S) = \frac{|E(S)|}{|S|} \le d^{\max}.$$

This concludes the proof. 

Now, the assignment of edges to one of the end points is constructed as 
the algorithm executes. Initially, all edges are unassigned. When the minimum 
degree vertex is deleted from S, the vertex is assigned all edges that go from the 
vertex to the rest of the vertices in S. We maintain the invariant that all edges 
between two vertices in the current set S are unassigned; all other edges are 
assigned. At the end of the execution of the algorithm, all edges are assigned. 

Let $d^{\max}$ be defined as before for the specific assignment constructed
corresponding to the execution of the greedy algorithm. The following lemma
relates the value of the solution constructed by the greedy algorithm to
$d^{\max}$.

Lemma 4. Let $v$ be the maximum value of $f(S)$ for all sets $S$ obtained during
the execution of the greedy algorithm. Then $d^{\max} \le 2v$.

Proof. Consider a single iteration of the greedy algorithm. Since $i_{\min}$ is
selected to be the minimum degree vertex in $S$, its degree is at most
$2|E(S)|/|S| \le 2v$. Note that a particular vertex gets assigned edges to it
only at the point when it is removed from $S$. This proves that
$d^{\max} \le 2v$.
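
The chain $\max_S f(S) \le d^{\max} \le 2v$ from Lemmas 3 and 4 can be checked numerically on small graphs. The following Python sketch is illustrative only; the helper names `greedy_with_orientation` and `brute_force_density` are hypothetical, and the brute-force search is exponential, so it is meant for tiny inputs.

```python
from fractions import Fraction
from itertools import combinations


def greedy_with_orientation(n, edges):
    """Run greedy peeling and return (v, d_max).

    v     -- best density f(S) seen among the peeled sets
    d_max -- largest number of edges assigned to a single vertex, where a
             vertex receives its remaining incident edges when it is removed
    """
    adj = {u: set() for u in range(n)}
    for i, j in edges:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    alive = set(range(n))
    m = sum(len(adj[u]) for u in alive) // 2
    v, d_max = Fraction(m, n), 0
    while alive:
        i_min = min(alive, key=lambda u: len(adj[u] & alive))
        assigned = len(adj[i_min] & alive)   # edges from i_min into the rest of S
        d_max = max(d_max, assigned)
        alive.remove(i_min)
        m -= assigned
        if alive:
            v = max(v, Fraction(m, len(alive)))
    return v, d_max


def brute_force_density(n, edges):
    """Exact maximum of f(S) over non-empty S; exponential, tiny graphs only."""
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}
    best = Fraction(0)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            s = set(subset)
            inside = sum(1 for e in edge_set if e <= s)
            best = max(best, Fraction(inside, k))
    return best
```

On any small test graph one would expect `brute_force_density(n, edges) <= d_max <= 2 * v` to hold for the values returned by `greedy_with_orientation`.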

Putting Lemmas 3 and 4 together, we get the following. 

Theorem 2. The greedy algorithm gives a 2-approximation for $f(G)$.

Running Time. It is easy to see that the greedy algorithm can be implemented
to run in $O(n^2)$ time for a graph with $n$ vertices and $m$ edges. We can
maintain the degrees of the vertices in the subgraph induced by $S$. Each
iteration involves identifying and removing the minimum degree vertex as well as
updating the degrees of the remaining vertices, both of which can be done in
$O(n)$ time. Using Fibonacci heaps, we can get a running time of
$O(m + n \log n)$, which is better for sparse graphs.



3.1 Intuition Behind the Upper Bound 

The reader may wonder about the origin of the upper bound on the optimal 
solution used in the previous section. In fact, there is nothing magical about 
this. It is closely related to the dual of the LP formulation used in Section 2. In 
fact, the dual of LP (1)-(5) is the following:

$$\min \gamma \tag{7}$$
$$\forall ij \in E: \quad \alpha_{ij} + \beta_{ij} \ge 1 \tag{8}$$
$$\forall i: \quad \gamma \ge \sum_j \alpha_{ij} + \sum_j \beta_{ji} \tag{9}$$
$$\alpha_{ij}, \beta_{ij} \ge 0 \tag{10}$$

The upper bound constructed corresponds to a dual solution where
$\alpha_{ij}, \beta_{ij}$ are 0-1 variables: $\alpha_{ij} = 1$ corresponds to
the edge $ij$ being assigned to $i$ and $\beta_{ij} = 1$ corresponds to the edge
$ij$ being assigned to $j$. Then $\gamma$ corresponds to $d^{\max}$.
In effect, our proof constructs a dual solution as the greedy algorithm executes.
The value of the dual solution is the upper bound in Lemma 3.

We now proceed to the problem of computing $d(G)$ for directed graphs $G$.
Here, the ideas developed in the algorithms for $f(G)$ for undirected graphs $G$
will turn out to be very useful.



4 Exact Algorithm for d(G)

Recall that $d(G)$ is the maximum value of $d(S, T)$ over all subsets $S, T$ of
vertices. We first present a linear programming relaxation for $d(G)$. Our LP
relaxation depends on the value of $|S|/|T|$ for the pair $S, T$ that maximizes
$d(S, T)$. Of course, we do not know this ratio a priori, so we write a separate
LP for every possible value of this ratio. Note that there are $O(n^2)$ possible
values. For $|S|/|T| = c$, we use the following LP relaxation $LP(c)$.



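$$\max \sum_{ij} x_{ij} \tag{11}$$
$$\forall ij: \quad x_{ij} \le s_i \tag{12}$$
$$\forall ij: \quad x_{ij} \le t_j \tag{13}$$
$$\sum_i s_i \le \sqrt{c} \tag{14}$$
$$\sum_j t_j \le \frac{1}{\sqrt{c}} \tag{15}$$
$$x_{ij}, s_i, t_j \ge 0 \tag{16}$$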



We now prove the analogue of Lemma 1 earlier. 

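Lemma 5. Consider $S, T \subseteq V$. Let $c = |S|/|T|$. Then the optimal value
of $LP(c)$ is at least $d(S, T)$.

Proof. We will give a feasible solution $(\bar{x}, \bar{s}, \bar{t})$ for
$LP(c)$ (11)-(16) with value $d(S, T)$. Let
$x = \frac{\sqrt{c}}{|S|} = \frac{1}{\sqrt{c}\,|T|}$. For each $i \in S$, set
$\bar{s}_i = x$. For each $j \in T$, set $\bar{t}_j = x$. For each
$ij \in E(S, T)$, set $\bar{x}_{ij} = x$. All the remaining variables are set to
0. Now, $\sum_i \bar{s}_i = |S| \cdot x = \sqrt{c}$ and
$\sum_j \bar{t}_j = |T| \cdot x = \frac{1}{\sqrt{c}}$. Thus, this is a feasible
solution to $LP(c)$. The value of this solution is

$$|E(S,T)| \cdot x = \frac{|E(S,T)|}{\sqrt{c} \cdot |T|} = \frac{|E(S,T)|}{\sqrt{|S||T|}} = d(S,T).$$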



This proves the lemma. 

The following lemma is the analogue of Lemma 2. 

Lemma 6. Given a feasible solution of $LP(c)$ with value $v$ we can construct
$S, T \subseteq V$ such that $d(S, T) \ge v$.

Proof. Consider a feasible solution $(\bar{x}, \bar{s}, \bar{t})$ to $LP(c)$
(11)-(16). Without loss of generality, we can assume that for all $ij$,
$\bar{x}_{ij} = \min(\bar{s}_i, \bar{t}_j)$.

We define a collection of sets $S, T$ indexed by a parameter $r \ge 0$. Let
$S(r) = \{i : \bar{s}_i \ge r\}$, $T(r) = \{j : \bar{t}_j \ge r\}$ and
$E(r) = \{ij : \bar{x}_{ij} \ge r\}$. Since $\bar{x}_{ij} \le \bar{s}_i$ and
$\bar{x}_{ij} \le \bar{t}_j$, $ij \in E(r) \Rightarrow i \in S(r), j \in T(r)$.
Also, since $\bar{x}_{ij} = \min(\bar{s}_i, \bar{t}_j)$,
$i \in S(r), j \in T(r) \Rightarrow ij \in E(r)$. Thus $E(r)$
is precisely the set of edges that go from $S(r)$ to $T(r)$.

Now, $\int_0^\infty |S(r)|\,dr = \sum_i \bar{s}_i \le \sqrt{c}$. Also,
$\int_0^\infty |T(r)|\,dr = \sum_j \bar{t}_j \le \frac{1}{\sqrt{c}}$. Hence

$$\int_0^\infty \sqrt{|S(r)||T(r)|}\,dr \le \sqrt{\left(\int_0^\infty |S(r)|\,dr\right)\left(\int_0^\infty |T(r)|\,dr\right)} \le 1$$

by the Schwarz inequality.
Note that $\int_0^\infty |E(r)|\,dr = \sum_{ij} \bar{x}_{ij}$. This is the
objective function value of the solution. Let this value be $v$.

We claim that there exists $r$ such that
$|E(r)|/\sqrt{|S(r)||T(r)|} \ge v$. Suppose there were no such $r$. Then

$$\int_0^\infty |E(r)|\,dr < v \int_0^\infty \sqrt{|S(r)||T(r)|}\,dr \le v.$$

This gives a contradiction. To find such an $r$, notice that we can check all
combinatorially distinct sets $S(r), T(r)$ by simply checking $S(r), T(r)$
obtained by setting $r = \bar{s}_i$ and $r = \bar{t}_j$ for every
$i \in V$, $j \in V$.
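
The directed rounding step can be sketched in Python in the same spirit. This is an illustration under assumptions: the function name `round_directed_lp_solution` and the dictionary inputs `s` and `t` are hypothetical, edge values are taken implicitly as $\min(\bar{s}_i, \bar{t}_j)$, and floating-point arithmetic is used because of the square root.

```python
import math


def round_directed_lp_solution(s, t, edges):
    """Return (S, T, density) maximising d(S(r), T(r)) over thresholds r.

    s, t  -- dicts vertex -> nonnegative LP values (hypothetical input format)
    edges -- iterable of directed edges (i, j), meaning i -> j
    """
    best = (set(), set(), 0.0)
    thresholds = sorted({val for val in list(s.values()) + list(t.values())
                         if val > 0})
    for r in thresholds:
        big_s = {i for i, val in s.items() if val >= r}
        big_t = {j for j, val in t.items() if val >= r}
        if not big_s or not big_t:
            continue                      # d(S, T) is undefined for empty sets
        crossing = sum(1 for i, j in edges if i in big_s and j in big_t)
        density = crossing / math.sqrt(len(big_s) * len(big_t))
        if density > best[2]:
            best = (big_s, big_t, density)
    return best
```

As the text notes next, the pair returned this way need not have ratio $|S|/|T|$ equal to the value of $c$ for which the LP was written.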

Note that the pair of sets $S, T$ guaranteed by the above proof need not satisfy
$|S|/|T| = c$. Putting Lemmas 5 and 6 together, we obtain the following theorem.



Theorem 3.

$$\max_{S,T \subseteq V} \{d(S,T)\} = \max_{c} \{OPT(LP(c))\} \tag{17}$$

where $OPT(LP(c))$ denotes the value of the optimal solution to $LP(c)$. Further,
sets $S, T$ maximizing $d(S, T)$ can be computed from the optimal solutions to
the set of linear programs $LP(c)$.

Proof. First we establish the equality (17). From Lemma 5, the RHS $\ge$ the
LHS. (Consider the $S, T$ that maximize $d(S, T)$.) From Lemma 6, the LHS $\ge$
the RHS. (Set $c$ to be the value that maximizes $OPT(LP(c))$ and consider the
optimal solution to $LP(c)$.) The proof of Lemma 6 gives a construction of sets
$S, T$ maximizing $d(S, T)$ from the LP solution that maximizes $OPT(LP(c))$.



Remark 1. Note that the proposed algorithm involves solving $O(n^2)$ LPs, one
for each possible value of the ratio $c = |S|/|T|$. In fact, this ratio can be
guessed to within a $(1 + \varepsilon)$ factor by using only
$O(\frac{\log n}{\varepsilon})$ values. It is not very difficult to show that
this would yield a $(1 + \varepsilon)$ approximation. Lemma 5 can be modified
to incorporate the $(1 + \varepsilon)$ factor.
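
One way to realize the guessing step of this remark is a geometric grid of candidate ratios. The Python sketch below is an assumption about how such a grid could be built (the name `ratio_grid` is hypothetical); it relies only on the fact that every ratio $|S|/|T|$ lies in $[1/n, n]$.

```python
import math


def ratio_grid(n, eps):
    """Powers of (1 + eps) covering the interval [1/n, n].

    Every feasible ratio c = |S|/|T| lies in [1/n, n], so some grid value g
    satisfies c <= g <= (1 + eps) * c. The grid has 2k + 1 entries with
    k = ceil(ln n / ln(1 + eps)), i.e. O(log n / eps) values for small eps.
    """
    if n < 1 or eps <= 0:
        raise ValueError("need n >= 1 and eps > 0")
    k = math.ceil(math.log(n) / math.log1p(eps))   # (1 + eps)**k >= n
    return [(1 + eps) ** e for e in range(-k, k + 1)]
```

For example, `ratio_grid(10, 0.5)` contains 13 candidate values instead of the up to 100 exact ratios $a/b$ with $1 \le a, b \le 10$.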



5 Approximation Algorithm for d(G)

5.1 Intuition behind Algorithm

Drawing from the insights gained in analyzing the greedy algorithm for
approximating $f(G)$, examining the dual of the LP formulation for $d(G)$ should
give us some pointers about how a greedy algorithm for $d(G)$ should be
constructed and analyzed.
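The dual of $LP(c)$ is the following linear program:

$$\min \sqrt{c} \cdot \gamma + \frac{\delta}{\sqrt{c}} \tag{18}$$
$$\forall ij: \quad \alpha_{ij} + \beta_{ij} \ge 1 \tag{19}$$
$$\forall i: \quad \gamma \ge \sum_j \alpha_{ij} \tag{20}$$
$$\forall j: \quad \delta \ge \sum_i \beta_{ij} \tag{21}$$
$$\alpha_{ij}, \beta_{ij}, \gamma, \delta \ge 0 \tag{22}$$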



Any feasible solution to the dual is an upper bound on the integral solution.
This naturally suggests an upper bound corresponding to a dual solution where
$\alpha_{ij}, \beta_{ij}$ are 0-1 variables: $\alpha_{ij} = 1$ corresponds to
the edge $ij$ being assigned to $i$ and $\beta_{ij} = 1$ corresponds to the edge
$ij$ being assigned to $j$. Then $\gamma$ is the maximum number of edges $ij$
assigned to any vertex $i$ (maximum out-degree), and $\delta$ is the maximum
number of edges $ij$ assigned to a vertex $j$ (maximum in-degree). Then the
value of the dual solution $\sqrt{c} \cdot \gamma + \frac{\delta}{\sqrt{c}}$ is
an upper bound on $d(S, T)$ for all pairs of sets $S, T$ such that
$|S|/|T| = c$.

5.2 Greedy Approximation Algorithm 

We will now use the insights gained from examining the dual of LP(c) to con- 
struct and analyze a greedy approximation algorithm. As in the exact algorithm, 
we need to guess the value of c = |S|/|T|. For each such value of c, we run a 
greedy algorithm. The best pair $S, T$ (i.e., one that maximizes $d(S, T)$)
produced by all such greedy algorithms is the output of our algorithm.

We now describe the greedy algorithm for a specific value of $c$. The algorithm
maintains two sets $S$ and $T$ and at each stage removes either the minimum
degree vertex in $S$ or the minimum degree vertex in $T$ according to a certain
rule. (Here the degree of a vertex $i$ in $S$ is the number of edges from $i$ to
$T$. The degree of a vertex $j$ in $T$ is similarly defined.)

1. Initially, $S \leftarrow V$, $T \leftarrow V$.

2. Let $i_{\min}$ be the vertex $i \in S$ that minimizes $|E(\{i\}, T)|$. Let $d_S \leftarrow |E(\{i_{\min}\}, T)|$.

3. Let $j_{\min}$ be the vertex $j \in T$ that minimizes $|E(S, \{j\})|$. Let $d_T \leftarrow |E(S, \{j_{\min}\})|$.

4. If $\sqrt{c} \cdot d_S \le \frac{1}{\sqrt{c}} \cdot d_T$ then

set $S \leftarrow S - \{i_{\min}\}$

else set $T \leftarrow T - \{j_{\min}\}$.

5. If both $S$ and $T$ are non-empty, go back to Step 2.

Of all the sets $S, T$ produced during the execution of the above algorithm,
the pair maximizing $d(S, T)$ is returned as the output of the algorithm.
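
Steps 1 to 5, together with the outer loop over all candidate ratios, can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' code: the function names are hypothetical, vertices are numbered 0..n-1, degrees are recomputed from scratch in every iteration for clarity (so the sketch is slower than the running times discussed later), and the test in Step 4 is rewritten as the equivalent $c \cdot d_S \le d_T$ to avoid square roots.

```python
import math
from fractions import Fraction


def density(edges, s, t):
    """d(S, T) = |E(S, T)| / sqrt(|S| |T|) for non-empty S and T."""
    crossing = sum(1 for i, j in edges if i in s and j in t)
    return crossing / math.sqrt(len(s) * len(t))


def greedy_for_ratio(n, edges, c):
    """Steps 1-5 for one guessed ratio c = |S|/|T|; returns (S, T, d(S, T))."""
    if n < 1:
        raise ValueError("graph must have at least one vertex")
    edges = set(edges)
    s, t = set(range(n)), set(range(n))              # Step 1
    best = (set(s), set(t), density(edges, s, t))
    while s and t:
        out_deg = {i: 0 for i in s}
        in_deg = {j: 0 for j in t}
        for i, j in edges:
            if i in s and j in t:
                out_deg[i] += 1
                in_deg[j] += 1
        i_min = min(s, key=out_deg.get)              # Step 2
        j_min = min(t, key=in_deg.get)               # Step 3
        # Step 4: sqrt(c) * d_S <= d_T / sqrt(c) is equivalent to c * d_S <= d_T.
        if c * out_deg[i_min] <= in_deg[j_min]:
            s.remove(i_min)
        else:
            t.remove(j_min)
        if s and t:                                  # Step 5
            d = density(edges, s, t)
            if d > best[2]:
                best = (set(s), set(t), d)
    return best


def greedy_directed_density(n, edges):
    """Run the greedy for every ratio c = a/b with 1 <= a, b <= n; keep the best."""
    ratios = {Fraction(a, b) for a in range(1, n + 1) for b in range(1, n + 1)}
    return max((greedy_for_ratio(n, edges, c) for c in ratios),
               key=lambda result: result[2])
```

On a star with three edges pointing into one vertex, the sketch returns the three sources as $S$ and the center as $T$, with density $\sqrt{3}$.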

In order to analyze the algorithm, we produce an upper bound on the optimal
solution. The upper bound has the following form suggested by the dual of
$LP(c)$: We assign each (directed) edge $ij$ to either $i$ or $j$. For a vertex
$i$, $d_{out}(i)$ is the number of edges $ij'$ assigned to $i$. For a vertex
$j$, $d_{in}(j)$ is the number of edges $i'j$ assigned to $j$. Let
$d_{out}^{\max} = \max_i \{d_{out}(i)\}$ and
$d_{in}^{\max} = \max_j \{d_{in}(j)\}$. The following lemma gives the upper
bound on $d(S, T)$ for all pairs $S, T$ such that $|S|/|T| = c$ in terms of
$d_{out}^{\max}$ and $d_{in}^{\max}$.
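Lemma 7.

$$\max_{|S|/|T| = c} \{d(S,T)\} \le \sqrt{c} \cdot d_{out}^{\max} + \frac{1}{\sqrt{c}} \cdot d_{in}^{\max}$$

Proof. This follows directly from the fact that the assignment of edges to
vertices corresponds to a 0-1 solution of the dual to $LP(c)$. Note that the
value of the corresponding dual solution is exactly
$\sqrt{c} \cdot d_{out}^{\max} + \frac{1}{\sqrt{c}} \cdot d_{in}^{\max}$. We
give an alternate, combinatorial proof of this fact.

Consider the pair of sets $S, T$ that maximizes $d(S, T)$ over all pairs $S, T$
such that $|S|/|T| = c$. Now, each edge in $E(S, T)$ must be assigned to a
vertex in $S$ or a vertex in $T$. Thus

$$|E(S,T)| \le |S| \cdot d_{out}^{\max} + |T| \cdot d_{in}^{\max}$$

$$d(S,T) = \frac{|E(S,T)|}{\sqrt{|S||T|}} \le \sqrt{\frac{|S|}{|T|}} \cdot d_{out}^{\max} + \sqrt{\frac{|T|}{|S|}} \cdot d_{in}^{\max} = \sqrt{c} \cdot d_{out}^{\max} + \frac{1}{\sqrt{c}} \cdot d_{in}^{\max}$$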


Now, the assignment of edges to one of the end points is constructed as 
the algorithm executes. Note that a separate assignment is obtained for each 
different value of c. Initially, all edges are unassigned. When a vertex is deleted 
from either $S$ or $T$ in Step 4, the vertex is assigned all edges that go from
the vertex to the other set (i.e., if $i_{\min}$ is deleted, it gets assigned
all edges from $i_{\min}$ to $T$ and similarly if $j_{\min}$ is deleted). We
maintain the invariant that all edges
that go from the current set S to the current set T are unassigned; all other 
edges are assigned. At the end of the execution of the algorithm, all edges are 
assigned. 

Let $d_{out}^{\max}$ and $d_{in}^{\max}$ be defined as before for the specific
assignment constructed corresponding to the execution of the greedy algorithm.
The following lemma relates the value of the solution constructed by the greedy
algorithm to $d_{out}^{\max}$ and $d_{in}^{\max}$.

Lemma 8. Let $v$ be the maximum value of $d(S, T)$ for all pairs of sets $S, T$
obtained during the execution of the greedy algorithm for a particular value of
$c$. Then $\sqrt{c} \cdot d_{out}^{\max} \le v$ and
$\frac{1}{\sqrt{c}} \cdot d_{in}^{\max} \le v$.

Proof. Consider an execution of Steps 2 to 5 at any point in the algorithm.
Since $i_{\min}$ is selected to be the minimum degree vertex in $S$, its degree
is at most $|E(S,T)|/|S|$, i.e., $d_S \le |E(S,T)|/|S|$. Similarly
$d_T \le |E(S,T)|/|T|$. Now,

$$\min\left(\sqrt{c} \cdot d_S,\ \frac{1}{\sqrt{c}} d_T\right) \le \sqrt{d_S d_T} \le \frac{|E(S,T)|}{\sqrt{|S||T|}} \le v.$$

If $\sqrt{c} \cdot d_S \le \frac{1}{\sqrt{c}} d_T$, then $i_{\min}$ is deleted
and assigned the edges going from $i_{\min}$ to $T$. In this case,
$\sqrt{c} \cdot d_S \le v$. If this were not the case, $j_{\min}$ is deleted and
assigned edges going from $S$ to $j_{\min}$. In this case,
$\frac{1}{\sqrt{c}} d_T \le v$. Note that a particular vertex gets assigned
edges to it only if it is removed from either $S$ or $T$ in Step 4. This proves
that $\sqrt{c} \cdot d_{out}^{\max} \le v$ and
$\frac{1}{\sqrt{c}} \cdot d_{in}^{\max} \le v$.

Putting Lemmas 7 and 8 together, we get the following. 

Lemma 9. Let $v$ be the maximum value of $d(S, T)$ for all pairs of sets $S, T$
obtained during the execution of the greedy algorithm for a particular value of
$c$. Then,

$$v \ge \frac{1}{2} \max_{|S|/|T| = c} \{d(S,T)\}.$$

Observe that the maximizing pair $S, T$ in the above lemma need not satisfy
$|S|/|T| = c$.

The output of the algorithm is the best pair $S, T$ produced by the greedy
algorithm over all executions (for different values of $c$). Let $S^*, T^*$ be
the sets that maximize $d(S, T)$ over all pairs $S, T$. Applying the previous
lemma for the specific value $c = |S^*|/|T^*|$, we get the following bound on
the approximation ratio of the algorithm.

Theorem 4. The greedy algorithm gives a 2-approximation for $d(G)$.

Remark 2. As in the exact LP based algorithm, instead of running the algorithm
for all $O(n^2)$ values of $c$, we can guess the value of $c$ in the optimal
solution to within a $(1 + \varepsilon)$ factor by using only
$O(\frac{\log n}{\varepsilon})$ values. It is not very difficult to show that
this would lose only a $(1 + \varepsilon)$ factor in the approximation ratio. We
need to modify Lemma 7 to incorporate the $(1 + \varepsilon)$ factor.

Running Time. Similar to the implementation of the greedy algorithm for $f(G)$,
the greedy algorithm for $d(G)$ for a particular value of $c$ can be implemented
naively to run in $O(n^2)$ time or in $O(m + n \log n)$ time using Fibonacci
heaps. By the above remark, we need to run the greedy algorithm for
$O(\frac{\log n}{\varepsilon})$ values of $c$ in order to get a
$2 + \varepsilon$ approximation.

6 Conclusion 

All the algorithms presented in this paper generalize to the setting where edges 
have weights. In conclusion, we mention some interesting directions for future 
work. In the definition of density $d(G)$ for directed graphs, the sets $S, T$
were not required to be disjoint. What is the complexity of computing a slightly
modified notion of density $d'(G)$ where we maximize $d(S, T)$ over disjoint
sets $S, T$? Note that any $\alpha$-approximation algorithm for $d(G)$ can be
used to obtain an $O(\alpha)$-approximation for $d'(G)$. Finally, it would be
interesting to obtain a flow based algorithm for computing $d(G)$ exactly, along
the same lines as the flow based algorithm for computing $f(G)$.
Abstract. In this paper, we derive bounds on performance guarantees
of online algorithms for real-time preemptive scheduling of jobs with
deadlines on $K$ machines when jobs are characterized in terms of their
minimum stretch factor $\alpha$ (or, equivalently, their maximum execution
rate $r = 1/\alpha$). We consider two well known preemptive models that
are of interest from practical applications: the hard real-time scheduling
model in which a job must be completed if it was admitted for execution
by the online scheduler, and the firm real-time scheduling model in which
the scheduler is allowed not to complete a job even if it was admitted
for execution by the online scheduler. In both models, the objective is to
maximize the sum of execution times of the jobs that were executed to
completion, preemption is allowed, and the online scheduler must immediately
decide, whenever a job arrives, whether to admit it for execution
or reject it. We measure the competitive ratio of any online algorithm
as the ratio of the value of the objective function obtained by this
algorithm to that of the best possible offline algorithm. We show that no
online algorithm can have a competitive ratio greater than
$1 - (1/\alpha) + \varepsilon$ for hard real-time scheduling with $K \ge 1$
machines and greater than $1 - (3/(4\lceil\alpha\rceil)) + \varepsilon$ for
firm real-time scheduling on a single machine, where $\varepsilon > 0$ may be
arbitrarily small, even if the algorithm is allowed to know the value of
$\alpha$ in advance. On the other hand, we exhibit a simple online scheduler
that achieves a competitive ratio of at least $1 - (1/\alpha)$ in
either of these models with $K$ machines. The performance guarantee of
our simple scheduler shows that it is in fact an optimal scheduler for hard
real-time scheduling with $K$ machines. We also describe an alternative
scheduler for firm real-time scheduling on a single machine in which the
competitive ratio does not go to zero as $\alpha$ approaches 1. Neither of our
schedulers knows the value of $\alpha$ in advance.



1 Introduction 

The need to support applications with real-time characteristics, such as speech 
understanding and synthesis, animation, and multimedia, has spurred research 
in operating system frameworks that provide quality of service (QoS) guarantees 
for real-time applications that run concurrently with traditional non-real-time 
workloads [8,9,11,12,13,14,19,22,23,24,27]. In such a framework, a real-time
application negotiates with the resource manager a range of “operating levels” at
which the application can run depending on the availability of resources. Based 
on the current state of the system, the resource manager may increase or decrease 
the application’s operating level within this pre-negotiated range. As an example, 
consider an application that displays video frames over a network link. Such an 
application may require a nominal rate of 30 frames/second, but if it is not possi- 
ble to achieve this rate, the rate may be reduced to 15 frames/second by skipping 
every other frame and still achieve reasonable quality. If there are still inadequate 
system resources, the rate may be further reduced to, say, 7.5 frames/second (by
displaying only every fourth frame). However, below this minimum rate the application
would produce completely inadequate performance and hence should not be run. 

In this paper, we consider the problem of scheduling real-time jobs for the
case when the job’s “operating level” is characterized by its stretch factor,
which is defined as the ratio of its response time (i.e., total time in the
system) to its execution time. Specifically, we consider the problem of
scheduling a set of $n$ independent jobs, $J = \{J_1, J_2, \ldots, J_n\}$, on
$K$ machines $M_1, M_2, \ldots, M_K$. Each job $J_i$ is characterized by the
following parameters: its arrival time $a(J_i)$, its execution time $e(J_i, j)$
on machine $M_j$, and its stretch factor $\alpha(J_i)$ ($\alpha(J_i) \ge 1$),
or equivalently its rate $r(J_i) = 1/\alpha(J_i)$ ($0 < r(J_i) \le 1$). The
parameter $\alpha(J_i)$ determines how late $J_i$ may be executed on machine
$M_j$; specifically, $J_i$ must be completed no later than its deadline
$d(J_i) = a(J_i) + e(J_i, j)\,\alpha(J_i)$ on machine $M_j$.
A valid schedule for executing these jobs is one in which each job $J_i$ is
scheduled on at most one machine, each machine $M_j$ executes only one job at
any time, and a job is executed only between its arrival time and its deadline.
Preemption is allowed during job execution, i.e., a job may be interrupted
during its execution and its processor may be allocated to another job. However,
migration of jobs is not allowed, i.e., once a job is scheduled to run on a
specific machine, it can only be executed on that machine.
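
The job model and the notion of a valid schedule on one machine can be made concrete with a small Python sketch. Everything named here (`Job`, `is_valid_on_one_machine`, the representation of a schedule as a list of execution slices) is a hypothetical choice made for illustration; the check covers only the "one job at a time" and "between arrival and deadline" conditions, not whether a job receives its full execution time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    arrival: float      # a(J)
    exec_time: float    # e(J, j) on the machine under consideration
    stretch: float      # alpha(J) >= 1

    @property
    def deadline(self) -> float:
        """d(J) = a(J) + e(J, j) * alpha(J)."""
        return self.arrival + self.exec_time * self.stretch

    @property
    def rate(self) -> float:
        """r(J) = 1 / alpha(J)."""
        return 1.0 / self.stretch


def is_valid_on_one_machine(slices):
    """Check a preemptive single-machine schedule.

    slices -- list of (start, end, job) execution intervals; a preempted job
              simply appears in several slices.
    """
    ordered = sorted(slices, key=lambda piece: piece[0])
    for (_, end1, _), (start2, _, _) in zip(ordered, ordered[1:]):
        if end1 > start2:                 # two slices would run at the same time
            return False
    return all(job.arrival <= start <= end <= job.deadline
               for start, end, job in ordered)
```

For instance, a job with arrival 0, execution time 2 and stretch factor 2 has deadline 4, so it can be preempted at time 1 by a shorter job and resumed at time 2 without violating validity.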

We are interested in online preemptive scheduling algorithms based on two
real-time models. In the hard real-time model, every job that is admitted by the
system must be completed by its deadline or the system will be considered to
have failed. This is in contrast to a soft real-time system [20] that allows
jobs to complete past their deadlines with no catastrophic effect except
possibly some degradation in performance. In [1] Baruah et al. considered a
special case of a soft real-time system, called a firm real-time system, in
which no “value” is gained for a job that completes past its deadline. We
consider both the hard and firm real-time models in this paper. In both cases,
we are interested in online scheduling algorithms that maximize the utilization,
which is defined as the sum of the execution times of all jobs that are
completed by their deadlines. Notice that in the hard real-time model, the
scheduler must only admit jobs that are guaranteed to complete by their
deadlines. In contrast, in the firm real-time model, the scheduler may admit
some jobs that do not complete by their deadlines, but such jobs do not
contribute to the utilization of the resulting schedule. We measure the
performance of the online algorithm in terms of its competitive ratio, which is
the ratio of the utilization obtained by this algorithm to that of the best
possible offline algorithm.

For a given set of jobs $J = \{J_1, J_2, \ldots, J_n\}$, an important parameter
of interest in adaptive rate-controlled scheduling is their maximum execution
rate $r(J) = \max\{r(J_i) \mid 1 \le i \le n\}$ (or, equivalently, their minimum
stretch $\alpha(J) = \min\{\alpha(J_i) \mid 1 \le i \le n\}$). In fact, it is
not difficult to see that without any bound on the execution rates of jobs no
online algorithm for the hard real-time scheduling model has a competitive ratio
greater than zero in the worst case (unless additional assumptions are made
about the relative execution time of the jobs [18]). However, natural
applications in rate-controlled scheduling lead to investigation of improved
bounds on the competitive ratios of online algorithms with a given a priori
upper bound on the execution rates of jobs. Because the stretch factor metric
has been widely used in the literature (e.g., see [6,7,21]), we choose to
continue with the stretch factor characterization of jobs (rather than the rate
characterization) in the rest of the paper and present all our bounds in
terms of the stretch factors of jobs.

The scheduling problem for jobs with deadlines (both preemptive and
non-preemptive versions) has a rather rich history. Below we just provide a
synopsis of the history; the reader is referred to a recent paper such as [4]
for more detailed discussions. The offline non-preemptive version of the problem
for a single machine is NP-hard even when all the jobs are released at the same
time [25]; however, this special case has a fully polynomial-time approximation
scheme. The offline preemptive version of the scheduling problem was studied by
Lawler [17], who found a pseudo-polynomial time algorithm, as well as polynomial
time algorithms for two important special cases. Kise, Ibaraki and Mine [15]
presented solutions for the special case of the offline non-preemptive version
of the problem when the release times and deadlines are similarly ordered. Two
recent papers [3,5] considered offline non-preemptive versions of the scheduling
problems on many related and/or unrelated machines and improved some of the
bounds in a previous paper on the same topic [4,26]. Online versions of the
problem for preemptive and non-preemptive cases were considered, among others,
in [2,16,18]. Baruah et al. [2] provide a lower and upper bound of 1/4 for one
machine and an upper bound of 1/2 for two machines on the competitive ratio for
online algorithms for firm real-time scheduling. Lipton and Tomkins [18] provide
an online algorithm with a competitive ratio of
$O(1/(\log \Delta)^{1+\varepsilon})$ for scheduling intervals in the hard
real-time model, where $\Delta$ is the ratio of the largest to the smallest
interval in the collection and $\varepsilon > 0$ is arbitrary, and show that no
online algorithm with a competitive ratio better than $O(1/\log \Delta)$ can
exist for this problem. Some other recent papers that considered various
problems related to stretch factors of jobs (such as minimizing the average
stretch factor during scheduling) are [6,7,21].

The following notations and terminologies are used in the rest of the paper.
$A_K$ denotes an online preemptive scheduling algorithm for $K$ machines and
$OPT_K$ is an optimal offline preemptive scheduling algorithm for $K$ machines.
When $K = 1$, we will simply denote them by $A$ and $OPT$, respectively. Unless
otherwise stated, $n$ denotes the number of jobs. For a given set of jobs $J$,
$U_{A_K}(J)$ (respectively, $U_{OPT_K}(J)$) denotes the processor utilization of
$A_K$ (respectively, of $OPT_K$). We say that an online scheduler $A_K$ has
competitive ratio $\rho(A_K, \alpha)$, $0 \le \rho(A_K, \alpha) \le 1$, if and
only if

$$\frac{U_{A_K}(J)}{U_{OPT_K}(J)} \ge \rho(A_K, \alpha)$$

for every job set $J$ with $\alpha(J) \ge \alpha$. Finally, $A_K$ has a time
complexity of $O(m)$ if and only if each decision of the scheduler can be
implemented in time $O(m)$. Our main results are as follows. In Section 2, we
show that $\rho(A_K, \alpha) \le 1 - (1/\alpha) + \varepsilon$ for hard
real-time scheduling, and
$\rho(A, \alpha) \le 1 - 3/(4\lceil\alpha\rceil) + \varepsilon$ for firm
real-time scheduling, where $\varepsilon > 0$ may be arbitrarily small. In
Section 3, we design a simple $O(n)$ time online scheduler $A_K$ with
$\rho(A_K, \alpha) \ge 1 - (1/\alpha)$ for either the hard real-time or the firm
real-time scheduling model. In Section 4, we describe an online scheduler for
firm real-time scheduling on a single machine in which the competitive ratio
does not go to zero as $\alpha$ approaches one. Due to space limitations, some
proofs are omitted.



2 Upper Bounds of the Competitive Ratio 

In this section, we prove upper bounds of the competitive ratio for hard real-time 
scheduling with K machines and firm real-time scheduling for a single machine. 
We need the following simple technical lemma which applies to both hard and 
firm real-time scheduling. 

Lemma 1. Assume that $\alpha \ge 1$ is an integer and we have a set of $\alpha$
jobs, each with stretch factor $\alpha$, execution time $x > 0$, and the same
arrival time $t$. Then, it is not possible to execute all these $\alpha$ jobs,
together with another job of positive execution time, during the time interval
$[t, t + \alpha x]$, in either the hard or the firm real-time scheduling model.

2.1 Hard Real-Time Scheduling for K Machines 

Theorem 1. For every $\alpha \ge 1$, every $K \ge 1$, any arbitrarily small
$\varepsilon > 0$, and for any online preemptive scheduling algorithm $A_K$,
$\rho(A_K, \alpha) \le 1 - (1/\alpha) + \varepsilon$.

Proof. For later usage in the proof, we need the following inequality: for any 
a > 1, — 5-2- < This is true since — < SxA > a > 1. 

It is sufficient to prove the theorem for all $\varepsilon < 1/\alpha$. The
proof proceeds by exhibiting explicitly a set of jobs with stretch factor at
least $\alpha$ for which $\rho(A_K, \alpha) \le 1 - (1/\alpha) + \varepsilon$
for any scheduling algorithm $A_K$. All jobs in the example have stretch factor
$\alpha$. Furthermore, any job $J$ in our example has the same execution time on
all machines, hence we will simply use the notation $e(J)$ instead of
$e(J, j)$. We break the proof into two cases depending on whether $\alpha$ is
an integer or not.
Case 1: a is an integer. Let a = iJ^ae) 0 < (5 < be any value. Note
that a > a > 1. The strategy of the adversary is as follows: 

(a) At time 0, the adversary generates a job $J_0$ with $e(J_0) = 1$ and
$\alpha(J_0) = \alpha$. $A_K$ must accept this job because otherwise the
adversary stops generating any more jobs and the competitive ratio is zero.

(b) At time 5i, ion \ < i < K — 1, the adversary stops (does not generate 
any more jobs) if Ak did not accept Ji_i; otherwise it generates a job Ji with 
^{Ji) = ^TT X)fe=o s('^fc) ct{Ji) = ce. Notice that e( Ji) > 1 for all 0 < i < K. 

First, we note that $A_K$ is forced to accept all the jobs. Otherwise, if $A_K$
did not accept job $J_i$ (for some $1 \le i \le K - 1$), then since $OPT_K$ can
execute all of the jobs $J_0, J_1, J_2, \ldots, J_i$ (each on a different
machine), we have



p{AK,a) 



nJoejJj) 



1 



1 + 



e(A) 

X)j=o 



1 



1 + 



0—1 

a 



1 — (1/0) + £ 



as promised. Next, we show that $A_K$ cannot execute two of these jobs on the
same machine. Suppose that job $J_i$ ($i > 0$) is the first job that is executed
on the same machine executing another job $J_k$ for some $k < i$. We know that
$J_i$ arrives after $J_k$. Hence, if $J_i$ has to execute to completion on the
same machine together with $J_k$, then we must have
$e(J_i) \le (\alpha - 1)e(J_k)$. However,



2 ^ 2 

e{Jj) > ^ e(Jfc) > (o - l)e(Jfc) > {a - l)e( Jfe) 
“ ^ j=o “ 



Hence, if $A_K$ still did not give up, it must be executing all the $K$ jobs,
each on a different machine.

(c) Now, the adversary generates a new set of aK jobs. Each job arrives at 
the same time 5K , has the same stretch factor a, and the same execution time 
e = x(X)^o^ e( Ji)) where x > 2a. Notice that none of the previous K jobs 
finished on or before the arrival of these new set of jobs since 5K < 1 and 
e(Ji) > 1 for all i. Hence, by Lemma 1, at most {a — 1) new jobs can be executed 
by Ak on a single machine. Hence, U{Ak) < X)iLo^ e(Ji) + K{a — l)e. The 
optimal scheduling algorithm OPTk will on the other hand reject all previous 
K jobs and accept all the new aK jobs. Thus, U{OPTk) > Kae and hence 



Ua + _ l + K{a-l)x , 

UoPT ~ Kae Kax ~ 

where the last inequality follows from the fact that x > 2a > e. 

Case 2: $\alpha$ is not an integer. Let $p = \lfloor\alpha\rfloor \ge 1$; thus
$\alpha > p \ge 1$. The strategy of the adversary is as follows.

(a) First, as in Case 1, the adversary first generates the following jobs. At time 
0, the adversary generates a job Jq with e(Jo) = 1 and a(Jo) = a and at time 6i, 
for 1 < i < K — 1, the adversary stops (does not generate any more jobs) if Ak
did not accept Ji_i; otherwise it generates a job Ji with e(Ji) = 
and a{Ji) = a. The same argument as in Case 1 indicates that Ak must schedule 
all these jobs, each on a separate machine. 



(b) Let P > max 



f 1 1 




(a-l)(a-p)> ' 


^Oc-p J 



— I maxo<i</c_i 



e(Ji) 

e{Jj) 



I be a positive integer. Now, at time 6K, the adversary generates a job Jr 



with e{JK) = f3{c(—p)e{Jo) and a{JK) = ot and at time 5i, for K+1 < i < 2K—1, 
the adversary stops (does not generate any more jobs) if Ar did not accept Ji-i; 
otherwise it generates a job Ji with e(Jj) = (3{a — p)e{Ji-R) and a{Ji) = a. 

First, note that it is possible to execute all the $2K$ jobs by executing jobs $J_i$
and $J_{K+i}$, for $0 \le i \le K - 1$, on the same machine. Since the jobs $J_i$ and $J_{K+i}$
are released $\delta K$ time apart, after $J_i$ finishes execution, the time left for $J_{K+i}$ to
complete its execution is

$$\alpha e(J_{K+i}) - e(J_i) + \delta K > \left(\alpha - \frac{1}{\beta(\alpha - p)}\right) e(J_{K+i}) \ge e(J_{K+i}),$$

which is sufficient for its execution. Next, note that $A_K$ must accept all the new
$K$ jobs. If $A_K$ did not accept $J_{K+i}$, for some $0 \le i \le K - 1$, then since the
adversary can execute all the jobs $J_1, J_2, \ldots, J_{K+i}$, we have



Ua < 
UoPT 



< 



e(Jj) 

AJj) 

I (/3(a-p)-l)e(Ji) 

/3(a-p) J2]=o 

2 ^ _ (/?(g-p)--l)e(Jd 

/3(g-p) AJj) 



since P{a 



_ , _ / {0{a-p)-l 

V d(g-p) 

<1 — ( 1 / 0 ) -|- £ 



AJi) 






P)>2 



where the last inequality follows since /3(a — p) > e{Ji)/{e{Ji) — ((1/a) + 
e{Jj)). Hence, after this step, Ar must be executing all the 2K jobs 
presented to it. 



(c) Now, as in (c) of Case 1, the adversary generates a new set of $\alpha K$
jobs. Each job arrives at the same time $2\delta K$, has the same stretch factor $\alpha$, and
the same execution time $e = x\left(\sum_{i=0}^{2K-1} e(J_i)\right)$, where $x > 2\alpha$. In a manner similar
to Case 1, it can be shown that at most $(\alpha - 1)$ new jobs can be executed by $A_K$
on a single machine. Hence, $U(A_K) = \sum_{i=0}^{2K-1} e(J_i) + K(\alpha - 1)e$. The optimal
scheduling algorithm $\mathrm{OPT}_K$ will, on the other hand, reject all previous $2K$ jobs
and accept all the new $\alpha K$ jobs. Thus, $U(\mathrm{OPT}_K) = K\alpha e$ and hence

$$\frac{U_A}{U_{\mathrm{OPT}}} \le \frac{\sum_{i=0}^{2K-1} e(J_i) + K(\alpha - 1)e}{K\alpha e} = \frac{1 + K(\alpha - 1)x}{K\alpha x} \le 1 - \frac{1}{\alpha} + \varepsilon,$$



where the last inequality follows from the fact that x > 2a > e. 



□ 
2.2 Firm Real-Time Scheduling for a Single Machine

Theorem 2. For every $\alpha \ge 1$, any arbitrarily small $\varepsilon > 0$, and any online
preemptive scheduling algorithm $A$, $\rho(A, \alpha) \le 1 - 3/(4\lceil\alpha\rceil) + \varepsilon$.

3 An Online Scheduler for Both Models 

In this section, we exhibit a simple $O(n)$ time online scheduler for $K$ machines,
where $n$ is the total number of jobs, that achieves a competitive ratio of at least
$1 - (1/\alpha)$ for either hard or firm real-time scheduling. The online scheduler $A_K$
is based on the EDF (Earliest Deadline First) scheduling algorithm [10]. When a
job $J$ arrives, $A_K$ first runs an admission test on the machines $M_1, M_2, \ldots, M_K$,
in any predetermined order, to check whether all previously admitted jobs that have
not yet completed, plus job $J$, can be completed by their respective deadlines
on some machine $M_j$. If so, $A_K$ admits $J$ on machine $M_j$; otherwise it rejects
$J$. Admitted jobs on each machine are executed by $A_K$ in nondecreasing order
of their deadlines. Thus, preemptions may occur: a currently executing job will
be preempted in favor of a newly admitted job with an earlier deadline. The
preempted job resumes execution when there are no more admitted jobs with
earlier deadlines.

The details of the scheduling algorithm of $A_K$ are as follows. $A_K$ maintains a
queue $Q_j$ of jobs that have been admitted but have not yet completed on machine
$M_j$. Each job in the queue $Q_j$ carries three pieces of information: (1) its job number,
(2) its deadline, and (3) its remaining execution time (i.e., its execution time
minus the processor time that it has consumed so far). The jobs in the queue are
ordered by nondecreasing deadlines. Thus, if the machine $M_j$ is busy, then it is
executing the job at the head of the queue $Q_j$ (which has the earliest deadline)
and the remaining execution time of this job decreases as it continues to execute.
The job is deleted from $Q_j$ when its remaining execution time becomes zero, and
the job (if any) that becomes the new head of the queue $Q_j$ is executed next on
$M_j$. Clearly, the total time taken by the online scheduler over all machines when
a new job arrives is $O(n)$.

As a final note, several jobs may arrive simultaneously;
in this case, $A_K$ processes these jobs in arbitrary order.
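
The admission test and the per-machine deadline-ordered queues described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Job`, `Machine`, `feasible_with` and `schedule` are hypothetical, jobs are processed in order of arrival, and the feasibility check simply accumulates remaining execution times in deadline order, which is one straightforward way to realise the test for jobs that have all already arrived.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    arrival: float
    exec_time: float
    deadline: float


@dataclass
class Machine:
    # Pending jobs stored as [deadline, name, remaining_time], sorted by deadline.
    queue: list = field(default_factory=list)
    clock: float = 0.0

    def advance(self, now):
        """Execute queued jobs in EDF order from self.clock up to time `now`."""
        budget = now - self.clock
        while self.queue and budget > 0:
            head = self.queue[0]
            run = min(budget, head[2])
            head[2] -= run
            budget -= run
            if head[2] <= 1e-12:
                self.queue.pop(0)  # job completed
        self.clock = now

    def feasible_with(self, job, now):
        """Admission test: do all pending jobs plus `job` meet their deadlines?"""
        trial = sorted(self.queue + [[job.deadline, job.name, job.exec_time]])
        finish = now
        for deadline, _, remaining in trial:
            finish += remaining
            if finish > deadline + 1e-12:
                return False
        return True

    def admit(self, job):
        self.queue.append([job.deadline, job.name, job.exec_time])
        self.queue.sort()


def schedule(jobs, num_machines):
    """Return {job name: machine index} for the jobs that are admitted."""
    machines = [Machine() for _ in range(num_machines)]
    admitted = {}
    for job in sorted(jobs, key=lambda j: j.arrival):
        for m in machines:
            m.advance(job.arrival)
        for index, m in enumerate(machines):  # fixed, predetermined order
            if m.feasible_with(job, job.arrival):
                m.admit(job)
                admitted[job.name] = index
                break
    return admitted
```

In this sketch a job with an earlier deadline that is admitted later moves to the head of the queue, so the preemption behaviour falls out of keeping the queue sorted rather than being coded separately.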

Before proceeding further, we first introduce some notation to simplify the
equations that will be presented shortly. Let $X$ be a set of jobs. We shall use
$X|_{\le d}$ ($X|_{>d}$) to denote the subset of jobs in $X$ whose deadlines are $\le d$
(respectively, $> d$). Additionally, we slightly abuse notation by using $e(X, j)$ to
denote the sum of the execution times of all jobs in $X$ on machine $M_j$.

Finally, the following inequality will be used later: 

Fact 1. $\frac{u + a}{v + b} \ge \frac{u}{v}$ whenever $u \le v$ and $a \ge b$.



Theorem 3. For every set of jobs $J$ with $\alpha(J) = \alpha$ and for every integer
$K \ge 1$, $\frac{U_{A_K}(J)}{U_{\mathrm{OPT}_K}(J)} \ge 1 - (1/\alpha)$.
Proof. Let $J = \{J_1, J_2, \ldots, J_n\}$ be the set of $n$ jobs. We wish to compare the
schedule of $A_K$ with that of an optimal offline scheduler $\mathrm{OPT}_K$. For a given
machine $M_j$, the schedule of $A_K$ for this machine produces an execution profile
that consists of an alternating sequence of busy and idle time intervals of $M_j$.
A busy time interval of $M_j$ corresponds to the case when the machine $M_j$ is
busy executing some job. Each busy interval (except the last) of $M_j$ is separated
from its next busy interval by an idle interval, during which the machine $M_j$
is not executing any job. Define a busy time interval for the scheduler $A_K$ to
be a time interval during which all of its $K$ machines are busy; otherwise call
it a non-busy time interval of $A_K$. Observe that any job that arrived during a
non-busy time interval of $A_K$ must have been scheduled by $A_K$ immediately,
since at least one of its machines was not executing any job. In other words, any
job that was rejected by $A_K$ must have arrived during a busy time interval of
$A_K$.

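Let $B_1, B_2, \ldots, B_m$ be the busy intervals of $A_K$. The jobs in $J$ can then be
partitioned into the following disjoint subsets:

— $J_1, J_2, \ldots, J_m$, where $J_i$ is the set of jobs that arrive during busy interval
$B_i$.

— The remaining set $J'' = J - \bigcup_{i=1}^{m} J_i$ of jobs whose arrival time is during a
non-busy time interval of $A_K$. All jobs in $J''$ were executed by $A_K$.

Let $J' = \bigcup_{i=1}^{m} J_i$. Let $J'_{A_K}$ (respectively, $J'_{\mathrm{OPT}_K}$) be the set of jobs from $J'$ that were
executed by $A_K$ (respectively, $\mathrm{OPT}_K$). We first claim that, to prove
Theorem 3, it is sufficient to show that

$$\frac{e(J'_{A_K})}{e(J'_{\mathrm{OPT}_K})} \ge 1 - (1/\alpha). \qquad (1)$$

Why is this so? Assume that $\mathrm{OPT}_K$ executes a subset $J''_{\mathrm{OPT}_K}$ of the jobs
in $J''$. Hence, $U_{A_K}(J) = e(J'_{A_K}) + e(J'')$, $U_{\mathrm{OPT}_K}(J) = e(J'_{\mathrm{OPT}_K}) + e(J''_{\mathrm{OPT}_K})$,
and $e(J'') \ge e(J''_{\mathrm{OPT}_K})$. Now, if $e(J'_{A_K}) \ge e(J'_{\mathrm{OPT}_K})$, then clearly
$\frac{U_{A_K}(J)}{U_{\mathrm{OPT}_K}(J)} \ge 1 > 1 - (1/\alpha)$. Otherwise, if $e(J'_{A_K}) < e(J'_{\mathrm{OPT}_K})$, then by Fact 1,
$\frac{U_{A_K}(J)}{U_{\mathrm{OPT}_K}(J)} \ge \frac{e(J'_{A_K})}{e(J'_{\mathrm{OPT}_K})} \ge 1 - (1/\alpha)$.

Now we prove Equation 1. Let $J_i^{A_K}$ and $J_i^{\mathrm{OPT}_K}$ be the subsets of
jobs in $J_i$ admitted by $A_K$ and $\mathrm{OPT}_K$, respectively. Obviously, since
$e(J'_{A_K}) = \sum_{i=1}^{m} e(J_i^{A_K})$ and $e(J'_{\mathrm{OPT}_K}) = \sum_{i=1}^{m} e(J_i^{\mathrm{OPT}_K})$, it suffices to prove that, for
each busy interval $B_i$ of $A_K$, $1 \le i \le m$:

$$\frac{e(J_i^{A_K})}{e(J_i^{\mathrm{OPT}_K})} \ge 1 - (1/\alpha). \qquad (2)$$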
We now prove Equation 2. For notational convenience, we drop the subscript
$K$ from $A_K$ and $\mathrm{OPT}_K$. Let $J_{i,j}^{A}$ (respectively, $J_{i,j}^{\mathrm{OPT}}$) be the subset of jobs in
$J_i^{A}$ (respectively, $J_i^{\mathrm{OPT}}$) that were scheduled on machine $M_j$. Since
$e(J_i^{A}) = \sum_{j=1}^{K} e(J_{i,j}^{A})$ and $e(J_i^{\mathrm{OPT}}) = \sum_{j=1}^{K} e(J_{i,j}^{\mathrm{OPT}})$, to prove Equation 2 it is sufficient
to show that

$$\frac{e(J_{i,j}^{A})}{e(J_{i,j}^{\mathrm{OPT}})} \ge 1 - (1/\alpha). \qquad (3)$$

We now prove Equation 3 for an arbitrary machine $M_j$. For notational convenience,
for a set of jobs $X$ we refer to $e(X, j)$ simply by $e(X)$. Let $t$ be the
time at which busy interval $B_i$ begins; thus, all jobs in $J_i$ have arrival times
no earlier than $t$. Let $X \in J_{i,j}^{\mathrm{OPT}} - J_{i,j}^{A}$ be a job with the latest deadline among all
jobs in $J_{i,j}^{\mathrm{OPT}} - J_{i,j}^{A}$ that was executed on machine $M_j$. If there is no such job
$X$, then $J_{i,j}^{\mathrm{OPT}} \subseteq J_{i,j}^{A}$ and Equation 3 trivially holds; hence we assume such
an $X$ exists. Also, assume that $e(J_{i,j}^{A}) < e(J_{i,j}^{\mathrm{OPT}})$; otherwise again Equation 3
trivially holds.

By the admission test, $A$ rejected $X$ for $M_j$ because its admission would have
caused some job $Y$ (possibly $X$) on $M_j$ to miss its deadline $d(Y)$; i.e.,

$$t + e(J_{i,j}^{A}|_{\le d(Y)}) + e(X) > d(Y).$$

The term $t$ on the left hand side of the above inequality is due to the fact that all
jobs in $J_{i,j}^{A}$ have arrival times no earlier than $t$ and hence cannot be executed
before time $t$. Since $J_{i,j}^{A} = J_{i,j}^{A}|_{\le d(Y)} \cup J_{i,j}^{A}|_{>d(Y)}$, we get:

$$e(J_{i,j}^{A}) > d(Y) - e(X) - t + e(J_{i,j}^{A}|_{>d(Y)}). \qquad (4)$$

Now, since OPT must complete all jobs in $J_{i,j}^{\mathrm{OPT}}$ on $M_j$ no later than their
deadlines, it should be the case that $t + e(J_{i,j}^{\mathrm{OPT}}|_{\le d}) \le d$ for every $d$. In particular,
when $d = d(Y)$, we have:

$$t + e(J_{i,j}^{\mathrm{OPT}}|_{\le d(Y)}) \le d(Y).$$

Since $J_{i,j}^{\mathrm{OPT}} = J_{i,j}^{\mathrm{OPT}}|_{\le d(Y)} \cup J_{i,j}^{\mathrm{OPT}}|_{>d(Y)}$, we get:

$$e(J_{i,j}^{\mathrm{OPT}}) \le d(Y) - t + e(J_{i,j}^{\mathrm{OPT}}|_{>d(Y)}).$$

By definition, $X$ has the latest deadline among all jobs admitted by OPT but
rejected by $A$ on machine $M_j$. Also, by the admission test, $Y$ is either $X$ or
some other job admitted by $A$ with deadline $d(Y) > d(X)$. It follows that
$J_{i,j}^{\mathrm{OPT}}|_{>d(Y)} \subseteq J_{i,j}^{A}|_{>d(Y)}$. The above inequality then becomes:

$$e(J_{i,j}^{\mathrm{OPT}}) \le d(Y) - t + e(J_{i,j}^{A}|_{>d(Y)}). \qquad (5)$$

From Equations 4 and 5 we get:
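$$\frac{e(J_{i,j}^{A})}{e(J_{i,j}^{\mathrm{OPT}})} > \frac{d(Y) - e(X) - t + e(J_{i,j}^{A}|_{>d(Y)})}{d(Y) - t + e(J_{i,j}^{A}|_{>d(Y)})}$$

$$= \frac{a(Y) + e(Y)\alpha(Y) - e(X) - t + e(J_{i,j}^{A}|_{>d(Y)})}{a(Y) + e(Y)\alpha(Y) - t + e(J_{i,j}^{A}|_{>d(Y)})}$$

$$= \frac{\dfrac{a(Y) - t + e(J_{i,j}^{A}|_{>d(Y)})}{\alpha(Y)} + e(Y) - \dfrac{e(X)}{\alpha(Y)}}{\dfrac{a(Y) - t + e(J_{i,j}^{A}|_{>d(Y)})}{\alpha(Y)} + e(Y)}$$

$$\ge \frac{e(Y) - \dfrac{e(X)}{\alpha(Y)}}{e(Y)} = 1 - \frac{e(X)}{\alpha(Y)e(Y)}, \quad \text{by Fact 1,}$$

where the second line substitutes $d(Y) = a(Y) + e(Y)\alpha(Y)$.

We now show that $e(X)/\alpha(Y) \le e(Y)/\alpha$. If $X = Y$, then $e(X)/\alpha(Y) =
e(Y)/\alpha(Y) \le e(Y)/\alpha$, since $\alpha(Y) \ge \alpha$. If $X \ne Y$, then $Y$ is some job previously
admitted by $A$ on machine $M_j$ such that $a(Y) \le a(X)$ and $d(Y) > d(X)$. Thus:

$$a(Y) + e(Y)\alpha(Y) > a(X) + e(X)\alpha(X)$$
$$e(Y)\alpha(Y) > (a(X) - a(Y)) + e(X)\alpha(X)$$
$$e(Y)\alpha(Y) > e(X)\alpha(X), \quad \text{since } a(X) - a(Y) \ge 0$$
$$e(Y)\alpha(Y) > e(X)\alpha, \quad \text{since } \alpha(X) \ge \alpha.$$

Hence, it follows that $e(X)/\alpha(Y) \le e(Y)/\alpha$. We therefore conclude that

$$\frac{e(J_{i,j}^{A})}{e(J_{i,j}^{\mathrm{OPT}})} > 1 - \frac{e(X)}{\alpha(Y)e(Y)} \ge 1 - \frac{1}{\alpha}.$$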



□ 



4 A Combined Scheduler for Firm Real-Time
Scheduling on a Single Machine

The competitive ratio $1 - (1/\alpha)$ for the scheduler described in Section 3 approaches
zero as $\alpha$ approaches one. For the case of firm real-time scheduling
on a single machine, this can be avoided by combining our scheduler with
version 2 of the $TD_1$ scheduler (Figure 2) of [2].

Of course, if $\alpha$ is known to the online algorithm in advance, then combining
the two algorithms is easy: if $\alpha > 4/3$, the scheduling algorithm of Section 3 is
chosen; otherwise the $TD_1$ scheduler of [2] is chosen. This would ensure that the
competitive ratio of the combined scheduler is at least $\max\{1/4, 1 - (1/\alpha)\}$. The
above discussion leads to the following simple corollary.

Corollary 1. If the value of $\alpha$ is known to the online scheduler in advance, then
there is a scheduler $A$ with $\rho(A, \alpha) \ge \max\{1/4, 1 - (1/\alpha)\}$.
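
As a small numerical illustration of the corollary, the following Python sketch evaluates the guaranteed ratio and the choice of scheduler for a known stretch factor. The function names are hypothetical, and the returned strings are labels for the two schedulers rather than implementations of them.

```python
def guaranteed_ratio(alpha):
    """Lower bound max{1/4, 1 - 1/alpha} on the competitive ratio."""
    if alpha < 1:
        raise ValueError("the stretch factor is assumed to be at least 1")
    return max(0.25, 1.0 - 1.0 / alpha)


def choose_scheduler(alpha):
    """Pick the scheduler by comparing alpha with the threshold 4/3."""
    if alpha > 4 / 3:
        return "EDF admission scheduler (Section 3)"
    return "TD1 scheduler of [2]"
```

For example, a stretch factor of 2 gives a guarantee of 1/2 through the scheduler of Section 3, while a stretch factor of 1 falls back to the guarantee of 1/4.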
However, in practice the value of $\alpha$ may not be known to the online scheduler
in advance. In that case, a different strategy needs to be used. The following
theorem can be proved using a new strategy.



Theorem 4. Even if the value of $\alpha$ is not known in advance to the scheduler,
it is possible to design a scheduler CS whose performance guarantee is given by

rtCS.o) 



1 7/64 
Abstract. Two natural classes of counting problems that are interreducible
under approximation-preserving reductions are: (i) those that
admit a particular kind of efficient approximation algorithm known as
an “FPRAS,” and (ii) those that are complete for #P with respect to
approximation-preserving reducibility. We describe and investigate not
only these two classes but also a third class, of intermediate complexity,
that is not known to be identical to (i) or (ii). The third class can
be characterised as the hardest problems in a logically defined subclass
of #P.



1 The Setting 

Not a great deal is known about the complexity of obtaining approximate
solutions to counting problems. A few problems are known to admit an efficient
approximation algorithm or “FPRAS” (definition below). Some others are known
not to admit an FPRAS under some reasonable complexity-theoretic assumptions.
In light of the scarcity of absolute results, we propose to examine the
relative complexity of approximate counting problems through the medium of
approximation-preserving reducibility. Through this process, a provisional
landscape of approximate counting problems begins to emerge. Aside from the
expected classes of interreducible problems that are “easiest” and “hardest” within
the counting complexity class #P, we identify an interesting class of natural
interreducible problems of apparently intermediate complexity.

A randomised approximation scheme (RAS) for a function $f : \Sigma^* \to \mathbb{N}$ is a
probabilistic Turing machine (TM) that takes as input a pair $(x, \varepsilon) \in \Sigma^* \times (0, 1)$
and produces as output an integer random variable $Y$ satisfying the condition
$\Pr(e^{-\varepsilon} \le Y/f(x) \le e^{\varepsilon}) \ge 3/4$. A randomised approximation scheme is said
to be fully polynomial if it runs in time $\mathrm{poly}(|x|, \varepsilon^{-1})$. The unwieldy phrase
“fully polynomial randomised approximation scheme” is usually abbreviated to
FPRAS.

Suppose $f, g : \Sigma^* \to \mathbb{N}$ are functions whose complexity (of approximation)
we want to compare. An approximation-preserving reduction from $f$ to $g$ is a
probabilistic oracle TM $M$ that takes as input a pair $(x, \varepsilon) \in \Sigma^* \times (0, 1)$, and
satisfies the following three conditions: (i) every oracle call made by $M$ is of
the form $(w, \delta)$, where $w \in \Sigma^*$ is an instance of $g$, and $0 < \delta < 1$ is an error
bound satisfying $\delta^{-1} \le \mathrm{poly}(|x|, \varepsilon^{-1})$; (ii) the TM $M$ meets the specification for
being a randomised approximation scheme for $f$ whenever the oracle meets the
specification for being a randomised approximation scheme for $g$; and (iii) the
run-time of $M$ is polynomial in $|x|$ and $\varepsilon^{-1}$. If an approximation-preserving
reduction from $f$ to $g$ exists we write $f \le_{\mathrm{AP}} g$, and say that $f$ is AP-reducible
to $g$. If $f \le_{\mathrm{AP}} g$ and $g \le_{\mathrm{AP}} f$ then we say that $f$ and $g$ are AP-interreducible,
and write $f \equiv_{\mathrm{AP}} g$.
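
To make the accuracy condition concrete, the following Python sketch checks whether a single output falls inside the multiplicative window that the definition requires to hold with probability at least 3/4. The function name is hypothetical, and the sketch assumes a strictly positive true value so that the ratio is defined.

```python
import math


def within_ras_tolerance(estimate, true_value, eps):
    """Return True when e^{-eps} <= estimate / true_value <= e^{eps}."""
    if true_value <= 0:
        raise ValueError("this sketch assumes a positive true value")
    if not 0 < eps < 1:
        raise ValueError("eps must lie in the open interval (0, 1)")
    ratio = estimate / true_value
    return math.exp(-eps) <= ratio <= math.exp(eps)
```

With eps = 0.1 the window is roughly [0.905, 1.105], so an estimate of 105 for a true value of 100 is acceptable while an estimate of 120 is not.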

Two counting problems play a special role in this article. 

Name. #Sat. 

Instance. A Boolean formula $\varphi$ in conjunctive normal form (CNF).

Output. The number of satisfying assignments to $\varphi$.

Name. #BIS. 

Instance. A bipartite graph B. 

Output. The number of independent sets in B. 
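
For very small instances, #BIS can be evaluated by exhaustive enumeration, which is useful for checking reductions by hand. The Python sketch below does exactly that; the function name and the edge-list representation are assumptions made for illustration, and the running time is exponential in the number of vertices.

```python
from itertools import product


def count_bipartite_independent_sets(left, right, edges):
    """Count vertex subsets of a bipartite graph that contain no edge."""
    vertices = list(left) + list(right)
    count = 0
    for bits in product([False, True], repeat=len(vertices)):
        chosen = {v for v, picked in zip(vertices, bits) if picked}
        # A subset is independent when no edge has both endpoints chosen.
        if all(not (u in chosen and w in chosen) for u, w in edges):
            count += 1
    return count
```

A single edge has three independent sets (the empty set and the two singletons), and the three-vertex path has five.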

The problem #Sat is the counting version of the familiar decision problem Sat,
so its special role is not surprising. The (apparent) significance of #BIS will
only emerge from an extended empirical study using the tool of approximation-preserving
reducibility. This is not the first time the problem #BIS has appeared
in the literature. Provan and Ball show it to be #P-complete [10], while (in the
guise of “2BPMonDNF”) Roth raises, at least implicitly, the question of its
approximability [11].

Three classes of AP-interreducible problems are studied in this paper. The
first is the class of counting problems (functions $\Sigma^* \to \mathbb{N}$) that admit an FPRAS.
These are trivially AP-interreducible, since all the work can be embedded into
the reduction (which declines to use the oracle). The second is the class of
counting problems AP-interreducible with #Sat. As we shall see, these include the
“hardest to approximate” counting problems within the class #P. The third is
the class of counting problems AP-interreducible with #BIS. These problems
are naturally AP-reducible to functions in #Sat, but we have been unable to
demonstrate the converse relation. Moreover, no function AP-interreducible with
#BIS is known to admit an FPRAS. Since a number of natural and reasonably
diverse counting problems are AP-interreducible with #BIS, it remains a
distinct possibility that the complexity of this class of problems in some sense lies
strictly between the class of problems admitting an FPRAS and #Sat. Perhaps
significantly, #BIS and its relatives can be characterised as the hardest to
approximate problems within a logically defined subclass of #P that we name
#RH$\Pi_1$.

Owing to space limitations, most proofs are omitted or abbreviated. Interested
readers may find complete proofs in the full version of the article [2].

2 Problems that Admit an FPRAS 

A very few non-trivial combinatorial structures may be counted exactly using a
polynomial-time deterministic algorithm; a fortiori, they may be counted using
an FPRAS. The two key examples are spanning trees in a graph (Kirchhoff),
and perfect matchings in a planar graph (Kasteleyn). Details of both algorithms
may be found in Kasteleyn’s survey article [9]. There are some further structures
that can be counted in the FPRAS sense despite being complete (with respect
to usual Turing reducibility) in #P. Two representative examples are matchings
of all sizes in a graph (Jerrum and Sinclair [6]) and satisfying assignments to a
Boolean formula in disjunctive normal form (Karp, Luby and Madras [8]).
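
As an illustration of the first example, spanning trees can be counted exactly via the matrix-tree theorem: delete one row and the matching column of the graph Laplacian and take the determinant. The Python sketch below is one way to do this, using exact rational arithmetic to avoid rounding; the function name and the vertex numbering 0, ..., n-1 are assumptions made for the example.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of a graph on vertices 0..n-1 (matrix-tree theorem)."""
    if n == 1:
        return 1
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            continue  # loops never appear in a spanning tree
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Delete row 0 and column 0, then compute the determinant by elimination.
    minor = [row[1:] for row in lap[1:]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if minor[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det
        det *= minor[col][col]
        for r in range(col + 1, size):
            factor = minor[r][col] / minor[col][col]
            for c in range(col, size):
                minor[r][c] -= factor * minor[col][c]
    return int(det)
```

The complete graph on four vertices has 16 spanning trees and the 4-cycle has 4, which the sketch reproduces.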

3 Problems AP-Interreducible with #Sat

Suppose $f, g : \Sigma^* \to \mathbb{N}$. A parsimonious reduction (Simon [13]) from $f$ to $g$
is a function $\varrho : \Sigma^* \to \Sigma^*$ satisfying (i) $f(w) = g(\varrho(w))$ for all $w \in \Sigma^*$, and
(ii) $\varrho$ is computable by a polynomial-time deterministic Turing transducer. In
the context of counting problems, parsimonious reductions “preserve the number
of solutions.” The generic reductions used in the usual proofs of Cook’s theorem
are parsimonious, i.e., the number of satisfying assignments of the constructed
formula is equal to the number of accepting computations of the given Turing
machine/input pair. Since a parsimonious reduction is a very special instance
of an approximation-preserving reduction, we see that all problems in #P are
AP-reducible to #Sat. Thus #Sat is complete for #P w.r.t. (with respect to)
AP-reducibility. The same is obviously true of any problem in #P to which
#Sat is AP-reducible.

Let $A : \Sigma^* \to \{0, 1\}$ be some decision problem in NP. One way of expressing
membership of $A$ in NP is to assert the existence of a polynomial $p$ and a
polynomial-time computable predicate $R$ (witness-checking predicate) satisfying
the following condition: $A(x)$ iff there is a word $y \in \Sigma^*$ such that $|y| \le p(|x|)$
and $R(x, y)$. The counting problem, $\#A : \Sigma^* \to \mathbb{N}$, corresponding to $A$ is defined
by

$$\#A(x) = \left|\{\, y : |y| \le p(|x|) \text{ and } R(x, y) \,\}\right|.$$

Formally, the counting version #A of $A$ depends on the witness-checking predicate
$R$ and not just on $A$ itself; however, there is usually a “natural” choice
for $R$, so our notation should not confuse. Note that our notation for #Sat and
Sat is consistent with the convention just established, where we take “$y$ is a
satisfying assignment to formula $x$” as the witness-checking predicate.
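
The definition of the counting problem in terms of a witness-checking predicate translates directly into a brute-force counter. The Python sketch below enumerates all words up to a length bound and counts those accepted by the predicate; the names `count_witnesses` and `sat_checker`, the binary alphabet, and the encoding of a CNF formula as a pair (clauses, number of variables) with signed integers for literals are all assumptions made for illustration.

```python
from itertools import product


def count_witnesses(x, max_len, checker, alphabet="01"):
    """Count words y with |y| <= max_len for which checker(x, y) holds."""
    total = 0
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            if checker(x, "".join(letters)):
                total += 1
    return total


def sat_checker(cnf, y):
    """Witness predicate for Sat: y encodes a satisfying assignment of cnf."""
    clauses, num_vars = cnf
    if len(y) != num_vars:
        return False
    # Literal l > 0 means variable l is true; l < 0 means variable -l is false.
    return all(
        any((y[abs(lit) - 1] == "1") == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

Taking the polynomial bound to be the number of variables, `count_witnesses(cnf, num_vars, sat_checker)` returns the value of #Sat on the encoded formula.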

Many “natural” NP-complete problems $A$ have been considered, and in every
case the corresponding counting problem #A is complete for #P w.r.t. (conventional)
polynomial-time Turing reducibility. No counterexamples to this
phenomenon are known, so it remains a possibility that this empirically observed
relationship is actually a theorem. If so, we seem to be far from proving it or
providing a counterexample. Strangely enough, the corresponding statement for
AP-reducibility is a theorem.

Theorem 1. Let $A$ be an NP-complete decision problem. Then the corresponding
counting problem, #A, is complete for #P w.r.t. AP-reducibility.

Proof. That #A $\in$ #P is immediate. The fact that #Sat is AP-reducible to
#A is more subtle. Using the bisection technique of Valiant and Vazirani, we
know [17, Cor. 3.6] that #Sat can be approximated (in the FPRAS sense) by
a polynomial-time probabilistic TM $M$ equipped with an oracle for the decision
problem Sat. Furthermore, the decision oracle for Sat may be replaced
by an approximate counting oracle (in the RAS sense) for #A, since $A$ is
NP-complete, and a RAS must, in particular, reliably distinguish none from some.
(Note that the failure probability may be made negligible through repeated
trials [7, Lemma 6.1].) Thus the TM $M$, with only slight modification, meets the
specification for an approximation-preserving reduction from #Sat to #A. We
conclude that the counting version of every NP-complete problem is complete
for #P w.r.t. AP-reducibility. □

The following problem is a useful starting point for reductions. 

Name. #LargeIS. 

Instance. A positive integer m and a graph G in which every independent set 
has size at most m. 

Output. The number of size-m independent sets in G. 

The decision problem corresponding to #LargeIS is NP-complete. Therefore,
Theorem 1 implies the following:

Observation 1. #LargeIS $\equiv_{\mathrm{AP}}$ #Sat.

Another insight that comes from considering the Valiant and Vazirani bisection
technique is that the set of functions AP-reducible to #Sat has a “structural”
characterisation as the class of functions that may be approximated (in the
FPRAS sense) by a polynomial-time probabilistic Turing transducer equipped
with an NP oracle. Informally, in a complexity-theoretic sense, approximate
counting is much easier than exact counting: the former lies “just above” NP [15],
while the latter lies above the entire polynomial hierarchy [16].

Theorem 1 shows that counting versions of NP-complete problems are all
AP-interreducible. Simon, who introduced the notion of parsimonious reduction [13],
noted that many of these counting problems are in fact parsimoniously
interreducible with #Sat. In other words, many of the problems covered by
Theorem 1 (including #LargeIS [2]) are in fact related by direct reductions,
often parsimonious, rather than merely by the rather arcane reductions implicit
in that theorem.

An interesting fact about exact counting, discovered by Valiant, is that a
problem may be complete for #P w.r.t. usual Turing reducibility even though
its associated decision problem is polynomial-time solvable. So it is with
approximate counting. A counting problem may be complete for #P w.r.t.
AP-reducibility when its associated decision problem is not NP-complete, and even
when it is trivial, as in the next example.

Name. #IS. 

Instance. A graph G. 

Output. The number of independent sets (of all sizes) in G. 



Theorem 2. #IS $\equiv_{\mathrm{AP}}$ #Sat.

The proof goes via the problem #LargeIS. The reduction, essentially the
same as one presented by Sinclair [14], uses a graph construction that boosts
the number of large independent sets until they form a substantial fraction of the
whole. Other counting problems can be shown to be complete for #P w.r.t.
AP-reducibility using similar “boosting reductions.” There is a paucity of examples
that are complete for some more “interesting” reason. One result that might
qualify is the following:

Theorem 3. #IS remains complete for #P w.r.t. AP-reducibility even when
restricted to graphs of maximum degree 25.

Proof. This follows from a result of Dyer, Frieze and Jerrum [3], though rather
indirectly. In the proof of Theorem 2 of [3] it is demonstrated that an FPRAS
for bounded-degree #IS could be used (as an oracle) to provide a polynomial-time
randomised algorithm for an NP-complete problem, such as the decision
version of satisfiability. Then #Sat $\le_{\mathrm{AP}}$ #IS follows, as before, via the bisection
technique of Valiant and Vazirani. □

Let $H$ be any fixed, $q$-vertex graph, possibly with loops. An $H$-colouring of
a graph $G$ is simply a homomorphism from $G$ to $H$. If we regard the vertices
of $H$ as representing colours, then a homomorphism from $G$ to $H$ induces a
$q$-colouring of $G$ that respects the structure of $H$: two colours may be adjacent
in $G$ only if the corresponding vertices are adjacent in $H$. Some examples:
$K_q$-colourings, where $K_q$ is the complete $q$-vertex graph, are simply the usual
(proper) $q$-colourings; colourings by the graph obtained from $K_2$ by adding one loop are
independent sets; and $S_q^*$-colourings, where $S_q^*$ is the $q$-leaf star with loops on
all $q + 1$ vertices, are configurations in the “$q$-particle Widom-Rowlinson model”
from statistical physics.
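
The three examples can be checked on small graphs by counting homomorphisms exhaustively. The Python sketch below is an illustration only: the function name, the edge-list encoding of $G$ and $H$, and the colour labels are assumptions, and the enumeration is exponential in the number of vertices of $G$.

```python
from itertools import product


def count_h_colourings(g_vertices, g_edges, h_vertices, h_edges):
    """Count maps from V(G) to V(H) that send every edge of G to an edge of H."""
    allowed = set()
    for a, b in h_edges:  # H is undirected, so store both orientations
        allowed.add((a, b))
        allowed.add((b, a))
    count = 0
    for colours in product(h_vertices, repeat=len(g_vertices)):
        colour_of = dict(zip(g_vertices, colours))
        if all((colour_of[u], colour_of[v]) in allowed for u, v in g_edges):
            count += 1
    return count
```

Taking $H$ to be $K_2$ with a loop on the vertex labelled "out", the colourings of the three-vertex path are its 5 independent sets (the vertices coloured "in"); taking $H = K_3$ gives its 12 proper 3-colourings; and taking $H$ to be the 2-leaf star with all loops gives 7 Widom-Rowlinson configurations on a single edge.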
Name. #$q$-Particle-WR-Configs.

Instance. A graph $G$.

Output. The number of $q$-particle Widom-Rowlinson configurations in $G$, i.e.,
$S_q^*$-colourings of $G$, where $S_q^*$ denotes the $q$-leaf star with loops on all $q + 1$
vertices.

Aside from containing many problems of interest, $H$-colourings provide an
excellent setting for testing our understanding of the complexity landscape of
(exact and approximate) counting. To initiate this programme we considered all
10 possible 3-vertex connected $H$s (up to symmetry, and allowing loops). The
complexity of exactly counting $H$-colourings was completely resolved by Dyer
and Greenhill [4]. Aside from $H = K_3^*$ (the complete graph with loops on all
three vertices) and $H = K_{1,2} = P_3$ ($P_n$ will be used to denote the path of
length $n - 1$ on $n$ vertices), which are trivially solvable, the problem of counting
$H$-colourings for connected three-vertex $H$s is #P-complete. Of the eight $H$s
for which exact counting is #P-complete, seven can be shown to be complete
for #P w.r.t. AP-reducibility using reductions very similar to those appearing
elsewhere in this article. The remaining possibility for $H$ is $S_2^*$ (i.e., 2-particle
Widom-Rowlinson configurations), which we return to in the next section. Other
complete problems could be mentioned here but we prefer to press on to a
potentially more interesting class of counting problems.



4 Problems AP-Interreducible with #BIS 

The reduction described in the proof of Theorem 2 does not provide useful
information about #BIS, since we do not have any evidence that the restriction
of #LargeIS to bipartite graphs is complete for #P w.r.t. AP-reducibility.^ The
fact that #BIS is interreducible with a number of other problems not known
to be complete (or to admit an FPRAS) prompts us to study #BIS and its
relatives in some detail. The following list provides examples of problems which
we can show to be AP-interreducible with #BIS.

Name. #$P_4$-Col.

Instance. A graph $G$.

Output. The number of $P_4$-colourings of $G$, where $P_4$ is the path of length 3.

Name. #Downsets.

Instance. A partially ordered set $(X, \preceq)$.

Output. The number of downsets in $(X, \preceq)$.



^ Note that this statement does not contradict the general principle, enunciated in 
§3, that counting-analogues of NP-complete decision problems are complete w.r.t. 
AP-reducibility, since a maximum cardinality independent set can be located in a 
bipartite graph using network flow. 
Name. #1p1nSat.

Instance. A Boolean formula $\varphi$ in conjunctive normal form (CNF), with at most
one unnegated literal per clause, and at most one negated literal.

Output. The number of satisfying assignments to $\varphi$.

Name. #BeachConfigs.

Instance. A graph $G$.

Output. The number of “Beach configurations” in $G$, i.e., $P_4^*$-colourings of $G$,
where $P_4^*$ denotes the path of length 3 with loops on all four vertices.

Name. #$P_q^*$-Col.

Instance. A graph $G$.

Output. The number of $P_q^*$-colourings of $G$, where $P_q^*$ is the path of length $q - 1$
with loops on all $q$ vertices.

Note that an instance of #1p1nSat is a conjunction of Horn clauses, each
having one of the restricted forms $x \Rightarrow y$, $\neg x$, or $y$, where $x$ and $y$ are variables.
Note also that #2-Particle-WR-Configs and #BeachConfigs are
the special cases $q = 3$ and $q = 4$, respectively, of #$P_q^*$-Col.

Theorem 4. The problems #BIS, #$P_4$-Col, #Downsets, #1p1nSat and
#$P_q^*$-Col (for $q \ge 3$, including as special cases #2-Particle-WR-Configs
and #BeachConfigs) are all AP-interreducible.

Clearly, #$P_2^*$-Col is trivially solvable. Theorem 4 is established by exhibiting
a cycle of explicit AP-reductions linking the various problems. One of those
reductions is presented below, in order to provide a flavour of some of the
techniques employed; the others may be found in [2].
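
The close relationship between #Downsets and #1p1nSat can be seen on small examples by brute force. In the Python sketch below (function names and input encodings are assumptions for illustration), a downset is a subset closed under moving to smaller elements, and a #1p1nSat instance is given as a list of implications together with variables forced true or forced false.

```python
from itertools import product


def count_downsets(elements, leq):
    """Count subsets D such that x in D and leq(y, x) imply y in D."""
    count = 0
    for bits in product([False, True], repeat=len(elements)):
        member = dict(zip(elements, bits))
        if all(member[y]
               for x in elements if member[x]
               for y in elements if leq(y, x)):
            count += 1
    return count


def count_1p1n_sat(variables, implications, forced_true=(), forced_false=()):
    """Count assignments satisfying clauses x => y, y, and not-x."""
    count = 0
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        ok = (all((not value[x]) or value[y] for x, y in implications)
              and all(value[v] for v in forced_true)
              and all(not value[v] for v in forced_false))
        if ok:
            count += 1
    return count
```

For the three-element chain a ≤ b ≤ c there are 4 downsets, and the same count is obtained from the implications b ⇒ a and c ⇒ b, reading "variable is true" as "element is in the downset".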

Lemma 1. #BIS $\le_{\mathrm{AP}}$ #2-Particle-WR-Configs.

Proof. Suppose $B = (X, Y, A)$ is an instance of #BIS, where $A \subseteq X \times Y$. For
convenience, $X = \{x_0, \ldots, x_{n-1}\}$ and $Y = \{y_0, \ldots, y_{n-1}\}$. Construct an instance
$G = (V, E)$ of #2-Particle-WR-Configs as follows. Let $U_i : 0 \le i \le n - 1$
and $K$ all be disjoint sets of size $3n$. Then define

$$V = \bigcup_{i \in [n]} U_i \cup \{v_0, \ldots, v_{n-1}\} \cup K$$

and

$$E = \bigcup_{i \in [n]} U_i^{(2)} \cup K^{(2)} \cup (\{v_0, \ldots, v_{n-1}\} \times K) \cup \bigcup \{U_i \times \{v_j\} : (x_i, y_j) \in A\},$$

where $U_i^{(2)}$, etc., denotes the set of all unordered pairs of elements from $U_i$. So
$U_i$ and $K$ all induce cliques in $G$, and all $v_j$ are connected to all of $K$. Let the
Widom-Rowlinson (W-R) colours be red, white and green, where white is the
centre colour. Say that a W-R configuration (colouring) is full if all the sets
$U_0, \ldots, U_{n-1}$ and $K$ are dichromatic. (Note that each set is either monochromatic,
or dichromatic red/white or green/white.) We shall see presently that full
W-R configurations account for all but a vanishing fraction of the set of all W-R
configurations.

Consider a full W-R configuration $C : V \to \{\text{red}, \text{white}, \text{green}\}$ of $G$. Assume
$C(K) = \{\text{red}, \text{white}\}$; the other possibility, with green replacing red, is symmetric.
Every full colouring in $G$ may be interpreted as an independent set in $B$ as
follows:

$$I = \{x_i : \text{green} \in C(U_i)\} \cup \{y_j : C(v_j) = \text{red}\}.$$

Moreover, every independent set in $B$ can be obtained in this way from exactly
$(2^{3n} - 2)^{n+1}$ full W-R configurations of $G$ satisfying the condition $C(K) =
\{\text{red}, \text{white}\}$. So $|\mathcal{W}'(G)| = 2(2^{3n} - 2)^{n+1} \cdot |\mathcal{I}(B)|$, where $\mathcal{W}'(G)$ denotes the set
of full W-R configurations of $G$, $\mathcal{I}(B)$ denotes the set of independent sets of $B$, and the
factor of two comes from symmetry between red and green.

Crude counting estimates provide

$$|\mathcal{W}(G) \setminus \mathcal{W}'(G)| \le 3(n + 1)(2 \cdot 2^{3n})^{n} 3^{n},$$

where $\mathcal{W}(G)$ denotes the set of all W-R configurations of $G$. Since

$$|\mathcal{W}(G) \setminus \mathcal{W}'(G)| < 2(2^{3n} - 2)^{n+1}$$

for sufficiently large $n$, we have

$$|\mathcal{I}(B)| = \left\lfloor \frac{|\mathcal{W}(G)|}{2(2^{3n} - 2)^{n+1}} \right\rfloor.$$

Now to get the result we just need to show how to set the accuracy parameter $\delta$
in the definition of AP-reducibility. Details are given in the full version [2].
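
The construction can be sanity-checked by brute force on the smallest case $n = 1$, where $G$ has only seven vertices. The Python sketch below builds $G$ from a list of index pairs $(i, j)$ standing for $(x_i, y_j) \in A$, enumerates all colourings, and counts both the W-R configurations and the full ones. The function names and the vertex labels are assumptions made for illustration. Only the identity relating full configurations to independent sets is checked here, since the final floor formula is stated for sufficiently large $n$ and enumeration is infeasible beyond $n = 1$.

```python
from itertools import combinations, product


def build_wr_instance(n, A):
    """Build G = (V, E) from a bipartite instance with |X| = |Y| = n."""
    size = 3 * n
    U = [[("u", i, k) for k in range(size)] for i in range(n)]
    K = [("k", k) for k in range(size)]
    v = [("v", j) for j in range(n)]
    vertices = [w for block in U for w in block] + v + K
    edges = []
    for block in U + [K]:                      # each U_i and K is a clique
        edges.extend(combinations(block, 2))
    edges.extend((vj, kk) for vj in v for kk in K)   # every v_j sees all of K
    for i, j in A:                             # U_i x {v_j} for (x_i, y_j) in A
        edges.extend((ui, v[j]) for ui in U[i])
    return vertices, edges, U, K


def count_wr_configurations(n, A):
    """Return (number of W-R configurations, number of full ones)."""
    vertices, edges, U, K = build_wr_instance(n, A)
    total = full = 0
    for colours in product(("red", "white", "green"), repeat=len(vertices)):
        c = dict(zip(vertices, colours))
        # Red and green may never be adjacent.
        if any({c[a], c[b]} == {"red", "green"} for a, b in edges):
            continue
        total += 1
        if all(len({c[w] for w in block}) == 2 for block in U + [K]):
            full += 1
    return total, full
```

For $n = 1$ the multiplier $2(2^{3n} - 2)^{n+1}$ equals 72; a single edge has 3 independent sets and the edgeless instance has 4, so the full configurations should number 216 and 288 respectively.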



5 A Logical Characterisation of #BIS and Its Relatives 

Saluja, Subrahmanyam and Thakur [12] have presented a logical characterisation
of the class #P (and of some of its subclasses), much in the spirit of Fagin’s
logical characterisation of NP [5]. In their framework, a counting problem is
identified with a sentence $\varphi$ in first-order logic, and the objects being counted with
models of $\varphi$. By placing a syntactic restriction on $\varphi$, it is possible to identify a
subclass #RH$\Pi_1$ of #P whose complete problems include all the ones mentioned
in Theorem 4.

We follow as closely as possible the notation and terminology of [12], and
direct the reader to that article for further information and clarification. A
vocabulary is a finite set $\sigma = \{\tilde{R}_0, \ldots, \tilde{R}_{k-1}\}$ of relation symbols of arities $r_0, \ldots, r_{k-1}$.
A structure $\mathcal{A} = (A, R_0, \ldots, R_{k-1})$ over $\sigma$ consists of a universe (set of
objects) $A$, and relations $R_0, \ldots, R_{k-1}$ of arities $r_0, \ldots, r_{k-1}$ on $A$; naturally, each
relation $R_i \subseteq A^{r_i}$ is an interpretation of the corresponding relation symbol $\tilde{R}_i$.$^3$

3 We have emphasised here the distinction between a relation symbol $\tilde{R}_i$ and its
interpretation $R_i$. From now on, however, we simplify notation by referring to both
as $R_i$. The meaning should be clear from the context.
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We deal exclusively with ordered finite structures; i.e., the size |^| of the uni- 
verse is finite, and there is an extra binary relation that is interpreted as a total 
order on the universe. Instead of representing an instance of a counting problem 
as a word over some alphabet S, we represent it as a structure A over a suitable 
vocabulary a. For example, an instance of #IS is a graph, which can be regarded 
as a structure A = (A, ~), where A is the vertex set and ~ is the (symmetric) 
binary relation of adjacency. 

The objects to be counted are represented as sequences T = (Tq, . . . ,T'r_i) 
and z = {zq, . . . ,Zm-i) of (respectively) relations and first-order variables. We 
say that a counting problem / (a function from structures over cr to numbers) 
is in the class if it can be expressed as 

/(A) = |{(T,z):Ah<^(z,T)}|, 

where is a first-order formula with relation symbols from cr UT and (free) vari- 
ables from z. For example, by encoding an independent set as a unary relation /, 
we may express quite simply as 

/is(A) = |{J : Vx,y. X ~ ^ ^/(x) V ^/(y)}|. 

Indeed, #IS is in the subclass $\#\Pi_1 \subseteq \#\mathrm{FO}$ (so named by Saluja et al.), since
the formula defining $f_{\mathrm{IS}}$ contains only universal quantification. Saluja et al. [12]
exhibit a strict hierarchy of subclasses

$$\#\Sigma_0 = \#\Pi_0 \subset \#\Sigma_1 \subset \#\Pi_1 \subset \#\Sigma_2 \subset \#\Pi_2 = \#\mathrm{FO} = \#\mathrm{P}$$

based on quantifier alternation depth. Among other things, they demonstrate
that all functions in $\#\Sigma_1$ admit an FPRAS.⁴
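
The logical definition of #IS can be evaluated directly on small inputs. The following is a minimal brute-force sketch, not an efficient algorithm: it enumerates every unary relation $I$ on the vertex set and keeps those satisfying the universally quantified formula. The function name and the graph encoding (a vertex list plus a list of edges) are assumptions made for illustration.

```python
from itertools import product

def count_independent_sets(vertices, edges):
    """Count unary relations I with: for all x, y, x ~ y implies not I(x) or not I(y)."""
    # Adjacency is a symmetric relation, so close the edge list under reversal.
    adjacent = set(edges) | {(v, u) for (u, v) in edges}
    count = 0
    for bits in product([False, True], repeat=len(vertices)):
        I = dict(zip(vertices, bits))
        # The formula has only universal quantifiers: check every adjacent pair.
        if all(not (I[x] and I[y]) for (x, y) in adjacent):
            count += 1
    return count

# Path a - b - c: the independent sets are {}, {a}, {b}, {c}, {a, c}.
print(count_independent_sets(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

The running time is exponential in the number of vertices, which is consistent with the fact that membership in $\#\Pi_1$ alone does not yield an FPRAS.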

All the problems introduced in §4, in particular those mentioned in Theorem 4,
lie in a syntactically restricted subclass $\#\mathrm{RH}\Pi_1 \subseteq \#\Pi_1$ to be defined
presently. Furthermore, they characterise $\#\mathrm{RH}\Pi_1$ in the sense of being complete
for $\#\mathrm{RH}\Pi_1$ w.r.t. AP-reducibility (and even with respect to a much more
demanding notion of reducibility). We say that a counting problem $f$ is in the
class $\#\mathrm{RH}\Pi_1$ if it can be expressed in the form

$$f(\mathbf{A}) = |\{(\mathbf{T}, \mathbf{z}) : \mathbf{A} \models \forall \mathbf{y}.\ \psi(\mathbf{y}, \mathbf{z}, \mathbf{T})\}|, \tag{1}$$

where $\psi$ is an unquantified CNF formula in which each clause has at most one
occurrence of an unnegated relation symbol from $\mathbf{T}$, and at most one occurrence
of a negated relation symbol from $\mathbf{T}$. The rationale behind the naming of the
class $\#\mathrm{RH}\Pi_1$ is as follows: “$\Pi_1$” indicates that only universal quantification is
allowed, and “RH” that the unquantified subformula $\psi$ is in “restricted Horn”
form. Note that the restriction on clauses of $\psi$ applies only to terms involving
symbols from $\mathbf{T}$; other terms may be arbitrary.

⁴ The class $\#\Sigma_1$ is far from capturing all functions admitting an FPRAS. For example,
#DNF-Sat admits an FPRAS even though it lies in $\#\Sigma_2 \setminus \#\Pi_1$ [12].

For example, suppose we represent an instance of #Downsets as a structure
$\mathbf{A} = (A, \preceq)$, where $\preceq$ is a binary relation (assumed to be a partial order). Then
#Downsets $\in \#\mathrm{RH}\Pi_1$ since the number of downsets in the partially ordered
set $(A, \preceq)$ may be expressed as

$$f_{\mathrm{DS}}(\mathbf{A}) = |\{D : \forall x \in A, y \in A.\ D(x) \wedge y \preceq x \Rightarrow D(y)\}|,$$

where we have represented a downset in an obvious way as a unary relation $D$
on $A$. The problem #1p1nSat is expressed by a formally identical expression,
but with $\preceq$ interpreted as an arbitrary binary relation (representing clauses)
rather than a partial order.
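
The shared form of #Downsets and #1p1nSat can be made concrete with a small brute-force counter. This is a sketch under the assumption that the binary relation is given as a set of pairs `(y, x)` meaning that $y$ is related to $x$; each pair then contributes one clause $\neg D(x) \vee D(y)$, which has exactly one negated and one unnegated variable. The function name is hypothetical.

```python
from itertools import product

def count_closed_sets(elements, relation):
    """Count unary relations D such that D(x) and (y, x) in relation imply D(y)."""
    count = 0
    for bits in product([False, True], repeat=len(elements)):
        D = dict(zip(elements, bits))
        # Each pair (y, x) is the clause (not D(x)) or D(y).
        if all((not D[x]) or D[y] for (y, x) in relation):
            count += 1
    return count

# A three-element chain 1 <= 2 <= 3, given with its reflexive and transitive pairs.
chain = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}
print(count_closed_sets([1, 2, 3], chain))  # downsets: {}, {1}, {1,2}, {1,2,3}

# An arbitrary (non-order) relation gives a #1p1nSat instance instead.
print(count_closed_sets(["a", "b", "c"], {("a", "b"), ("b", "a")}))
```

When the relation is a partial order the count is the number of downsets; for any other relation the same code counts satisfying assignments of the corresponding 1p1n formula.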

We are able to show: 

Theorem 5. #1p1nSat is complete for $\#\mathrm{RH}\Pi_1$ under parsimonious reducibility.

Proof (sketch). Consider the generic counting problem in $\#\mathrm{RH}\Pi_1$, as presented
in equation (1). Suppose $\mathbf{T} = (T_0, \ldots, T_{r-1})$, $\mathbf{y} = (y_0, \ldots, y_{\ell-1})$ and
$\mathbf{z} = (z_0, \ldots, z_{m-1})$, where $(T_i)$ are relations of arity $(t_i)$, and $(y_j)$ and $(z_k)$ are
first-order variables. Let $L = |A|^{\ell}$ and $M = |A|^{m}$, and let $(\eta_0, \ldots, \eta_{L-1})$ and
$(\zeta_0, \ldots, \zeta_{M-1})$ be enumerations of $A^{\ell}$ and $A^{m}$. Then

$$\mathbf{A} \models \forall \mathbf{y}.\ \psi(\mathbf{y}, \mathbf{z}, \mathbf{T}) \quad\text{iff}\quad \mathbf{A} \models \bigwedge_{q=0}^{L-1} \psi(\eta_q, \mathbf{z}, \mathbf{T}),$$

and

$$f(\mathbf{A}) = \sum_{s=0}^{M-1} \Bigl|\Bigl\{\mathbf{T} : \bigwedge_{q=0}^{L-1} \psi_{q,s}(\mathbf{T})\Bigr\}\Bigr|, \tag{2}$$

where $\psi_{q,s}(\mathbf{T})$ is obtained from $\psi(\eta_q, \zeta_s, \mathbf{T})$ by replacing every subformula that is
true (resp., false) in $\mathbf{A}$ by TRUE (resp., FALSE). Now $\bigwedge_{q=0}^{L-1} \psi_{q,s}(\mathbf{T})$ is a CNF
formula with propositional variables $T_i(\alpha_i)$ where $\alpha_i \in A^{t_i}$. Moreover, there is
at most one occurrence of an unnegated propositional variable in each clause,
and at most one of a negated variable. Thus, expression (2) already provides an
AP-reduction to #1p1nSat, since $f(\mathbf{A})$ is the sum of the numbers of satisfying
assignments to $M$ (i.e. polynomially many) instances of #1p1nSat.

The reduction as it stands is not parsimonious. With extra work, however,
it is possible to combine the $M$ instances of #1p1nSat into one; this process is
described in the full paper [2].



Corollary 1. The problems #BIS, #$P_4$-Col, #$P_q^*$-Col (for $q \ge 3$, including
as special cases #2-Particle-WR-Configs and #BeachConfigs) and
#Downsets are all complete for $\#\mathrm{RH}\Pi_1$ w.r.t. AP-reducibility.

Corollary 1 continues to hold even if “AP-reducibility” is replaced by a more 
stringent reducibility. In fact, most of our results remain true for more strin- 
gent reducibilities than AP-reducibility. This phenomenon is explored in the full 
article [2], where one such more demanding notion of reducibility is proposed.

6 Problems to which #BIS Is Reducible

There are some problems that we have been unable to place in any of the three
AP-interreducible classes considered in this article even though reductions from
#BIS can be exhibited. The existence of such reductions may be considered
as weak evidence for intractability, at least provisionally while the complexity
status of the class $\#\mathrm{RH}\Pi_1$ is unclear. Two examples are #3-Particle-WR-Configs
(the special case of #q-Particle-WR-Configs with $q = 3$) and
#Bipartite q-Col:

Name. #Bipartite q-Col.

Instance. A bipartite graph $B$.

Output. The number of $q$-colourings of $B$.

Theorem 6. #BIS is AP-reducible to #3-Particle-WR-Configs and
#Bipartite q-Col.

7 An Erratic Sequence of Problems 

In this section, we consider a sequence of $H$-colouring problems. Let $\mathrm{Wr}_q$ be the
graph with vertex set $V_q = \{a, b, c_1, \ldots, c_q\}$ and edge set

$$E_q = \{(a,b), (b,b)\} \cup \{(b, c_i), (c_i, c_i) : 1 \le i \le q\}.$$

$\mathrm{Wr}_0$ is just $K_2$ with one loop added. $\mathrm{Wr}_1$ is called “the wrench” in [1]. Consider
the problem #q-Wrench-Col, which is defined as follows.

Name. #q-Wrench-Col.

Instance. A graph $G$.

Output. The number of $\mathrm{Wr}_q$-colourings of $G$.

Theorem 7.

— For $q \le 1$, #q-Wrench-Col is AP-interreducible with #Sat.

— #2-Wrench-Col is AP-interreducible with #BIS.

— For $q \ge 3$, #q-Wrench-Col is AP-interreducible with #Sat.

Theorem 7 indicates that either (i) #BIS is AP-interreducible with #Sat (which
would perhaps be surprising) or (ii) the complexity of approximately counting
$H$-colourings is “non-monotonic”: the complexity for $H$ from a regularly
constructed sequence may jump down and then up again.
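
To make the definition of $\mathrm{Wr}_q$ concrete, the sketch below builds the graph and counts $\mathrm{Wr}_q$-colourings (homomorphisms from $G$ to $\mathrm{Wr}_q$) by exhaustive enumeration. It is only an illustration for tiny inputs; the function names and the encoding of graphs as vertex lists and edge collections are assumptions, and nothing here reflects the approximation algorithms or reductions behind Theorem 7.

```python
from itertools import product

def wrench(q):
    """Return (vertices, edges) of Wr_q, with edges stored as a symmetric set of pairs."""
    vertices = ["a", "b"] + [f"c{i}" for i in range(1, q + 1)]
    edges = {("a", "b"), ("b", "b")}
    for i in range(1, q + 1):
        edges |= {("b", f"c{i}"), (f"c{i}", f"c{i}")}
    edges |= {(v, u) for (u, v) in edges}  # undirected: add reversed pairs
    return vertices, edges

def count_h_colourings(g_vertices, g_edges, h_vertices, h_edges):
    """Count maps from V(G) to V(H) that send every edge of G to an edge of H."""
    count = 0
    for colours in product(h_vertices, repeat=len(g_vertices)):
        col = dict(zip(g_vertices, colours))
        if all((col[u], col[v]) in h_edges for (u, v) in g_edges):
            count += 1
    return count

# Wr_0-colourings of a triangle: the vertices coloured "a" form an independent set.
h_vertices, h_edges = wrench(0)
print(count_h_colourings([1, 2, 3], [(1, 2), (2, 3), (1, 3)], h_vertices, h_edges))
```

For $q = 0$ the count coincides with the number of independent sets of $G$, since the only constraint is that no two adjacent vertices both receive colour $a$.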



Acknowledgements 

The case $q = 1$ of Theorem 7 is due to Mike Paterson. We thank Dominic
Welsh for telling us about reference [12] and Marek Karpinski for stimulating
discussions on the topic of approximation-preserving reducibility.







References 

1. G.R. Brightwell and P. Winkler, Graph homomorphisms and phase transitions, 
CDAM Research Report, LSE-CDAM-97-01, Centre for Discrete and Applicable 
Mathematics, London School of Economics, October 1997. 

2. Martin Dyer, Leslie Ann Goldberg, Catherine Greenhill and Mark Jerrum, On
the Relative Complexity of Approximate Counting Problems, Computer Science
Research Report CS-RR-370, University of Warwick, February 2000.
http://www.dcs.warwick.ac.uk/pub/index.html

3. M. Dyer, A. Frieze and M. Jerrum, On counting independent sets in sparse graphs.
Proceedings of the 40th IEEE Symposium on Foundations of Computer Science
(FOCS’99), IEEE Computer Society Press, 1999, 210-217.

4. M. Dyer and C. Greenhill, The complexity of counting graph homomorphisms
(Extended abstract). Proceedings of the 11th Annual ACM-SIAM Symposium on
Discrete Algorithms (SODA’00), ACM-SIAM 2000, 246-255.

5. R. Fagin, Generalized first-order spectra and polynomial time recognisable sets.
In “Complexity of Computation” (R. Karp, ed.), SIAM-AMS Proceedings 7, 1974,
43-73.

6. M. Jerrum and A. Sinclair, The Markov chain Monte Carlo method: an approach to
approximate counting and integration. In Approximation Algorithms for NP-hard
Problems (Dorit Hochbaum, ed.), PWS, 1996, 482-520.

7. M.R. Jerrum, L.G. Valiant and V.V. Vazirani, Random generation of combinatorial 
structures from a uniform distribution. Theoretical Computer Science 43 (1986), 
169-188. 

8. R.M. Karp, M. Luby and N. Madras, Monte-Carlo approximation algorithms for 
enumeration problems. Journal of Algorithms 10 (1989), 429-448. 

9. P.W. Kasteleyn, Graph theory and crystal physics. In Graph Theory and Statistical 
Physics (F. Harary, ed.). Academic Press, 1967, 43-110. 

10. J.S. Provan and M.O. Ball, The complexity of counting cuts and of computing the 
probability that a graph is connected, SIAM Journal on Computing 12 (1983), 
777-788. 

11. D. Roth, On the hardness of approximate reasoning. Artificial Intelligence
Journal 82 (1996), 273-302.

12. S. Saluja, K.V. Subrahmanyam and M.N. Thakur, Descriptive complexity of #P
functions, Journal of Computer and System Sciences 50 (1995), 493-505.

13. J. Simon, On the difference between one and many (Preliminary version).
Proceedings of the 4th International Colloquium on Automata, Languages and
Programming (ICALP), Lecture Notes in Computer Science 52, Springer-Verlag, 1977,
480-491.

14. A. Sinclair, Algorithms for random generation and counting: a Markov chain
approach, Progress in Theoretical Computer Science, Birkhäuser, Boston, 1993.

15. L. Stockmeyer, The complexity of approximate counting (preliminary version). 
Proceedings of the 15th ACM Symposium on Theory of Computing (STOC’83), 
ACM, 1983, 118-126. 

16. S. Toda, PP is as hard as the polynomial-time hierarchy, SIAM Journal on Com- 
puting 20 (1991), 865-877. 

17. L.G. Valiant and V.V. Vazirani, NP is as easy as detecting unique solutions. The- 
oretical Computer Science 47 (1986), 85-93. 




On the Hardness of Approximating
NP Witnesses



Uriel Feige, Michael Langberg, and Kobbi Nissim 



Department of Computer Science and Applied Mathematics 
Weizmann Institute of Science, Rehovot 76100 
{feige,mikel,kobbi}@wisdom.weizmann.ac.il



Abstract. The search version for NP-complete combinatorial optimization
problems asks for finding a solution of optimal value. Such a solution
is called a witness. We follow a recent paper by Kumar and Sivakumar,
and study a relatively new notion of approximate solutions that ignores
the value of a solution and instead considers its syntactic representation
(under some standard encoding scheme).

The results that we present are of a negative nature. We show that
for many of the well known NP-complete problems (such as 3-SAT,
CLIQUE, 3-COLORING, SET COVER) it is NP-hard to produce a
solution whose Hamming distance from an optimal solution is substantially
closer than what one would obtain by just taking a random solution.
In fact, we have been able to show similar results for most of
Karp’s 21 original NP-complete problems. (At the moment, our results
are not tight only for UNDIRECTED HAMILTONIAN CYCLE and
FEEDBACK EDGE SET.)

1 Introduction 

Every language $L \in$ NP is characterized by a polynomial-time decidable
relation $\mathcal{R}_L$ such that $L = \{x \mid (w, x) \in \mathcal{R}_L \text{ for some } w\}$. Such a characterization
intuitively implies that every $x \in L$ has a polynomial witness $w$ for being an
instance of $L$. For example in the language 3SAT the witness for a formula $\phi$
being satisfiable is a satisfying assignment $a$ so that $\phi(a) = 1$.

Let $\phi$ be a satisfiable 3SAT formula. The problem of constructing a satisfying
assignment for $\phi$ is NP-hard. This motivates an attempt to achieve an efficient
approximation for a satisfying assignment. There are several possible notions of
approximation. For instance, one may attempt to find an assignment that satisfies
many of the clauses in $\phi$. This is the standard notion of approximation
that has been studied extensively. Given a formula $\phi$ in which each clause is of
size exactly three, the trivial algorithm that chooses an assignment at random is
expected to satisfy at least 7/8 of the clauses. This is essentially best possible, as
it is NP-hard to find an assignment that satisfies more than a $7/8 + \varepsilon$ fraction
of the clauses [10]. We consider a different notion of approximation presented in
[12] in which one seeks to find an assignment which agrees with some satisfying
assignment on many variables. A random assignment is expected to agree with
some satisfying assignment on at least half of the variables. We show that it is
NP-hard to find an assignment that agrees with some satisfying assignment on
significantly more than half of the variables. Our hardness results are based on
Karp (many-to-one) reductions and thus also exclude the existence of randomized
algorithms that approximate some satisfying assignment with nonnegligible
probability (under the assumption that NP does not have expected polynomial
time algorithms).
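
The baseline of "half of the variables" can be checked exhaustively on a toy formula. The sketch below is illustrative only: it assumes clauses are tuples of signed integers (literal `k` means variable `k` is true, `-k` means it is false) and that the formula is satisfiable, and it averages, over all assignments, the best agreement with any satisfying assignment.

```python
from itertools import product

def satisfies(assignment, clauses):
    """True if the tuple of booleans `assignment` satisfies every clause."""
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in clauses)

def expected_best_agreement(n, clauses):
    """Average over uniform x of the maximum number of variables on which
    x agrees with some satisfying assignment (formula assumed satisfiable)."""
    all_assignments = list(product([False, True], repeat=n))
    witnesses = [a for a in all_assignments if satisfies(a, clauses)]
    total = 0
    for x in all_assignments:
        total += max(sum(xi == ai for xi, ai in zip(x, a)) for a in witnesses)
    return total / len(all_assignments)

# x1 -> x2 -> x3 -> x1 forces all three variables to be equal (two witnesses).
print(expected_best_agreement(3, [(-1, 2), (-2, 3), (-3, 1)]))
```

With a single witness the average is exactly $n/2$; additional witnesses can only raise it, which is why the hardness results are stated for agreement significantly above one half.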

Recently, a randomized algorithm for finding a satisfying assignment for
3SAT formulas was presented by Schöning [13]. This algorithm has a running
time of $O((4/3)^n)$, which is the best currently known. Roughly speaking, the
algorithm generates an initial assignment that hopefully agrees with some
satisfying assignment on significantly more than half of the variables, and then
searches its neighborhood. The initial assignment is chosen at random, with
exponentially small success probability. The ability to find such an assignment in
polynomial time would improve the running time of this algorithm. Our results
essentially show the unlikeliness of improving the algorithm in this way.

A natural question is whether the hardness of approximating witnesses
phenomenon holds for NP-complete problems other than 3SAT. The answer
depends to some extent on the syntactic representation of witnesses. As shown
in [12], if we are given sufficient freedom in how to encode witnesses, then the
hardness of approximating witnesses extends to all other NP-complete problems.
However, it is much more interesting to examine the witness approximation issue
under more natural encodings of witnesses.

We examine a large set of NP-complete problems, namely Karp’s original list
of 21 problems [11]. For each of these problems we consider a “natural” syntactic
representation for witnesses. Under this representation, we prove tight hardness
results similar to those proved for 3SAT. For example, we encode witnesses
for 3-COLORING in ternary rather than binary, representing the colors of the
vertices. In this case a random 3-coloring is expected to agree with some valid
3-coloring on only 1/3 of the vertices, rather than 1/2. We show that given a
3-colorable graph $G = (V, E)$ one cannot find a 3-coloring that agrees with some
valid 3-coloring of $V$ on significantly more than 1/3 of the vertices. As another
example, the encoding that we use for the HAMILTONIAN CYCLE problem
is an indicator vector for the edges that participate in a Hamiltonian cycle. We
consider graphs that contain roughly $m = 2n$ edges, in which case a random
indicator vector is within Hamming distance roughly $m/2$ from an indicator
vector of a Hamiltonian cycle. We show that it is NP-hard to find an indicator
vector of distance significantly less than $m/2$ from a witness in the directed
version of HAMILTONIAN CYCLE.

NP-complete languages exhibit diverse behavior with respect to the standard
notion of approximation. On the other hand, our results indicate that with
respect to our notion of witness inapproximability, many NP-complete problems
share the property that their witnesses cannot be approximated better
than random. The only problems from [11] for which we were unable to fully
demonstrate this phenomenon are the undirected version of HAMILTONIAN CYCLE
and the FEEDBACK EDGE SET problem (for the latter we have tight
results on graphs with parallel edges). At this point, it is not clear if this is just
a matter of constructing more clever gadgets, whether it is an issue of choosing a
different encoding scheme for witnesses (e.g., for HAMILTONIAN CYCLE one
may encode a solution as an ordered list of vertices rather than as an indicator
vector for edges), or whether there is a deeper reason for these two exceptions.

Previous Work: As mentioned earlier, Kumar and Sivakumar [12] examine
witness approximation properties of any language in NP when we are free to
choose the syntactic representation of witnesses. They show that for any NP
language $L$ there exists a witness relation $\mathcal{R}_L$ such that it is NP-hard given
an instance $x \in L$ to produce an approximate witness $w'$ that agrees with some
witness $w$ of $x$ on significantly more than half its bits.

In our notion of witness approximation, one has to produce a solution of
small Hamming distance from a valid witness, but one need not know in which
bits of the solution the difference lies. A different problem that is well studied is
whether one can recover with certainty just one bit of a witness (e.g., the value
of one variable in a satisfying assignment for a 3SAT formula $\phi$). The answer to
this is known to be negative (for typical NP-complete problems), due to the self
reducibility property (e.g., $\phi$ can then be reduced to a new formula with one less
variable). Making use of the self reducibility property on one particular instance
involves a sequence of reductions. Gal et al. [5] show a related hardness result via
a Karp reduction. They show that a procedure $A$ that selects any subset of $\sqrt{n}$
variables and outputs their truth values in some satisfying assignment for $\phi$ may
be used for solving SAT, with a single call to $A$. They show similar results for
other problems such as graph isomorphism and shortest lattice vector. Related
results of the same nature appear also in [12,9].

One may also consider the approximation of a specific property of a satisfying
assignment for a 3SAT formula. For example, the maximum number of variables
assigned a true truth value. Zuckerman [15] considered the following maximization
version of NP-complete problems. Let $\mathcal{R}_L(w, x)$ be a polynomial-time predicate
corresponding to an NP-complete language $L$ so that $L = \{x \mid \exists w\ (w, x) \in \mathcal{R}_L\}$,
where $w \in \{0,1\}^m$. Let $S \subseteq \{1, \ldots, m\}$ and view $w$ as a subset of
$\{1, \ldots, m\}$. Given $x, S$, compute $\max |S \cap w|$ over $w$ satisfying $\mathcal{R}_L(w, x)$. For
all the 21 NP-complete problems in Karp’s list, Zuckerman shows that this
function is NP-hard to approximate within a factor of $n^{\varepsilon}$ for some constant
$\varepsilon > 0$.

Tools and Overview: A major breakthrough that led to many hardness of
approximation results is the PCP theorem [4,2,1]. Our results follow from weaker
tools, and are not based on the PCP theorem. Specifically, our results are based
on the generic family of NP problems for which it is NP-hard to obtain solutions
within small Hamming distance of witnesses presented in [12]. This construction,
in turn, is based on efficient list decodeable error correcting codes. Such codes
are presented in [9].

The standard reductions for proving the NP-completeness of problems preserve
the hardness of exactly finding witnesses. However, in most cases, these
reductions do not preserve the hardness of approximating witnesses. In this respect
our situation is similar to that encountered for the more standard notion
of approximating the value of the objective function. Nevertheless, we typically
find that in such reductions, some particular part of the witness (e.g., the value
of some of the variables in a satisfying assignment) is hard to approximate within
a small Hamming distance. We call this part the core of the problem. (The rest
of the witness is typically the result of introducing auxiliary variables during the
reduction.)

Our reductions identify the core of the problem and amplify it, so that it
becomes almost all of the witness, resulting in the hardness of approximating
the whole witness. This amplification is very much problem dependent. For some
problems, it is straightforward (e.g., for 3SAT it is a simple duplication of
variables), whereas for others we construct special gadgets.

In Section 2 we present the relation mentioned in [12] between list decodeable
error correcting codes and witness approximation. In Section 3 we prove that
given a 3SAT formula it is NP-hard to obtain an assignment which agrees with
any of its satisfying assignments on significantly more than 1/2 of the variables.
This result combines techniques from [5,12]. In Sections 4 and 5 we extend
this result to the maximum CLIQUE, and minimum CHROMATIC NUMBER
problems. In Section 6 we extend our hardness results to the remaining problems
in Karp’s [11] list of NP-complete problems.

2 List Decodeable ECC and Witness Approximation 

Let $\mathrm{dist}(x, y)$ be the normalized Hamming distance between two vectors $x$ and $y$
of equal length, i.e. the fraction of characters which differ in $x$ and $y$. Note that
$0 \le \mathrm{dist}(x, y) \le 1$. An $[n, k, d]_q$ error correcting code is a function $C$ defined as
follows.

Definition 1 (Error Correcting Code - ECC). Let $\Sigma$ be an alphabet of size
$q$. An $[n, k, d]_q$ error correcting code is a function $C : \Sigma^k \to \Sigma^n$ with the property
that for every two distinct $a, b \in \Sigma^k$ we have that $\mathrm{dist}(C(a), C(b))$ is greater than
or equal to $d$.



Given an $[n, k, d]_q$ error correcting code $C$ and a word $c \in \Sigma^n$, consider the
problem of finding all codewords in $C$ that are close to $c$. This problem is a
generalization of the standard decoding problem, and is denoted as list decoding
([14,8,9]). We say that an $[n, k, d]_q$ error correcting code $C$ is list decodeable, if
there is an efficient procedure that given a word $c \in \Sigma^n$ produces a short list
containing all words $a \in \Sigma^k$ for which $C(a)$ is close to $c$.

Definition 2 (List Decodeable ECC). Let $\Sigma$ be an alphabet of size $q$. An
$[n, k, d]_q$ error correcting code $C$ is $\delta$ list decodeable if there exists a Turing
machine $D$ (the list decoder) which on input $c \in \Sigma^n$ outputs in poly($n$) time a list
containing all words $a \in \Sigma^k$ that satisfy $\mathrm{dist}(C(a), c) \le \delta$.
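
Definitions 1 and 2 can be illustrated on a toy code. The sketch below uses the binary Hadamard code as a stand-in (an assumption made for illustration; it is not the code of [9]) and a brute-force list decoder that enumerates all messages. The brute-force decoder runs in time exponential in $k$, so it only illustrates the input/output behaviour of a list decoder, not the poly($n$) requirement.

```python
from itertools import product

def dist(x, y):
    """Normalized Hamming distance between two equal-length sequences."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y)) / len(x)

def hadamard_encode(a):
    """Toy binary code: one parity bit <a, v> mod 2 for every v in {0,1}^k."""
    return tuple(sum(ai * vi for ai, vi in zip(a, v)) % 2
                 for v in product([0, 1], repeat=len(a)))

def minimum_distance(encode, k, q=2):
    """Smallest normalized distance between codewords of distinct messages."""
    msgs = list(product(range(q), repeat=k))
    return min(dist(encode(a), encode(b)) for a in msgs for b in msgs if a != b)

def list_decode(encode, k, c, delta, q=2):
    """All messages a whose codeword lies within normalized distance delta of c."""
    return [a for a in product(range(q), repeat=k) if dist(encode(a), c) <= delta]

received = list(hadamard_encode((1, 0, 1)))
received[0] ^= 1  # corrupt one of the eight positions
print(minimum_distance(hadamard_encode, 3))
print(list_decode(hadamard_encode, 3, tuple(received), 0.25))
```

Increasing `delta` towards 1/2 makes the returned list longer; the codes used in the text are chosen so that the list stays of polynomial length at radius close to $1 - 1/q$.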

In [9] the following theorem regarding list decodeable error correcting codes
over finite fields is proven.

Theorem 1 ([9]). For any finite field $F_q$ of size $q$ and any constant $\varepsilon > 0$
there exists an $[n, k, (1 - \ldots)]_q$ ECC which is $(1 - \frac{1}{q} - \varepsilon)$ list decodeable.

In the remainder of this section we review the work of [12] which establishes
a connection between the existence of list decodeable ECC and the hardness of
finding approximate witnesses for general NP relations.

Definition 3 (Approximation within Hamming Distance). Given a polynomial
time relation $\mathcal{R}$ and an instance $x$ we say that $w'$ approximates a witness
for $x$ within Hamming distance $\delta$ if there is a witness $w$ (in the sense that
$(w, x) \in \mathcal{R}$) for which $\mathrm{dist}(w', w) < \delta$.

Kumar and Sivakumar [12] show that for every NP-complete language $L$
there is a formulation via a relation $\hat{\mathcal{R}}_L$ for which it is NP-hard to approximate
a witness within Hamming distance significantly better than 1/2. Let $\mathcal{R}_L$ be a
polynomial relation so that $L = \{x \in \{0,1\}^* \mid \exists w \in \{0,1\}^* \text{ s.t. } \mathcal{R}_L(w, x) \text{ holds}\}$.
Given $x$, finding a witness $w$ so that $\mathcal{R}_L(w, x)$ holds is NP-hard. Let $C$ be the
list decodeable error correcting code of Theorem 1 (over $GF(2)$), and let $\hat{\mathcal{R}}_L$ be the
relation $\{(\hat{w}, x) \mid \exists w \text{ s.t. } \hat{w} = C(w) \text{ and } \mathcal{R}_L(w, x) \text{ holds}\}$. Note that the language
$L$ is equal to $\{x \mid \exists \hat{w} \text{ s.t. } \hat{\mathcal{R}}_L(\hat{w}, x) \text{ holds}\}$.

Claim 2 ([12]). Unless NP = P there is no polynomial-time algorithm that
given $x$ of size $n$ produces $w'$ that approximates a witness for $\hat{\mathcal{R}}_L$ within Hamming
distance $1/2 - \varepsilon$ for some $\varepsilon = \ldots$

Proof. Assume the existence of a polynomial time algorithm $A$ that given $x$
produces $w'$ as in the claim. Apply the list decoding algorithm for $C$ on $w'$ to
compute a list $W$ (of polynomial length). By definition of $\hat{\mathcal{R}}_L$, there exists $w \in W$
such that $(w, x) \in \mathcal{R}_L$. As finding such a witness $w$ is NP-hard the assertion is
proven. □
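
The argument in the proof can be traced on a toy instance. The following sketch makes several assumptions for illustration: the relation $\mathcal{R}_L$ is CNF satisfiability with clauses given as tuples of signed integers, the code $C$ is the binary Hadamard code (which has exponential length, unlike the code of Theorem 1), and the list decoder is brute force. All function names are hypothetical.

```python
from itertools import product

def dist(x, y):
    """Normalized Hamming distance."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def encode(w):
    """Toy code C: the Hadamard encoding of the 0/1 tuple w."""
    return tuple(sum(wi * vi for wi, vi in zip(w, v)) % 2
                 for v in product([0, 1], repeat=len(w)))

def list_decode(c, k, delta):
    """Brute-force stand-in for the list decoder: messages with C(w) near c."""
    return [w for w in product([0, 1], repeat=k) if dist(encode(w), c) <= delta]

def relation(w, clauses):
    """Toy R_L(w, x): the 0/1 assignment w satisfies the CNF x = clauses."""
    return all(any((lit > 0) == bool(w[abs(lit) - 1]) for lit in clause)
               for clause in clauses)

def recover_witness(clauses, approx, k, delta):
    """Turn an approximate encoded witness into an exact witness for R_L."""
    for w in list_decode(approx, k, delta):
        if relation(w, clauses):  # checking a candidate is easy
            return w
    return None

clauses = [(1, 2), (-1, 3), (-2, -3)]
approx = list(encode((1, 0, 1)))
approx[0] ^= 1  # an approximate witness: one corrupted position
print(recover_witness(clauses, tuple(approx), 3, 0.25))
```

The point of the sketch is the control flow: any procedure that outputs a word close to an encoded witness, combined with list decoding and a polynomial-time check of each candidate, yields an exact witness.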



3 Witness Inapproximability of SAT 

In this section we prove the hardness of finding approximate satisfying assignments
for Boolean formulas. Namely, we show that given a satisfiable Boolean
formula $\phi$, it is NP-hard to find an assignment $x$ to the variables of $\phi$ which is
of Hamming distance significantly less than 1/2 to some satisfying assignment
of $\phi$. We present a proof for 3CSP formulas and conclude in a simple reduction
which extends this result to 3SAT formulas as well. This proof combines techniques
from [5,12]. In the proof below and the remaining sections of our work,
the parameter $\varepsilon$ is set to be $1/n^c$ for some constant $c > 0$, where $n$ is the witness
size of the problem considered. During our reductions it might be the case that
the witness size changes from $n$ to some polynomial in $n$. In such cases the constant
$c$ above should also change. We do not state this change explicitly, and the
notation $\varepsilon$ should be treated as $1/n^c$ for some constant $c > 0$ where the constant
$c$ may differ from one appearance of $\varepsilon$ to another.

Consider the search version of the 3CSP (3SAT) problem in which we are
given a formula $\phi$ where each clause is some Boolean function (disjunction) over
three variables and we wish to obtain a satisfying assignment for $\phi$.

Claim 3. Approximating satisfying assignments for 3SAT (and 3CSP) formulas
within Hamming distance $1/2 - \varepsilon$ is NP-hard.

Proof. We start by proving the hardness of approximating a satisfying assignment
for 3CSP formulas within Hamming distance less than 1/2. Let
$\mathcal{R} = \{(w, \phi_0) \mid \exists a \text{ s.t. } w = C(a) \text{ and } \phi_0(a) \text{ is true}\}$ where $\phi_0$ is a SAT formula. Denote
the size of a witness $w$ to $\phi_0$ in $\mathcal{R}$ by $n$. We follow the proof paradigm described
in the introduction. We use the relation $\mathcal{R}$ above as the base of the reduction,
present a core instance $\phi_1$ with the property that some variables of $\phi_1$ are hard
to approximate, and then amplify these hard variables (by duplicating them) to
obtain our final formula $\phi_2$.

Using Claim 2, we have that unless NP = P there is no polynomial-time
algorithm that given $\phi_0$ produces $w'$ that approximates a witness for $\phi_0$ in $\mathcal{R}$
within Hamming distance $1/2 - \varepsilon$.

Core: Using any standard reduction (e.g., Cook’s theorem) transform $\mathcal{R}$ into a
3SAT formula $\phi_1$. Denote the variables of $\phi_1$ as $\{w_1, \ldots, w_n; z_1, \ldots, z_l\}$, where
the values of $w_1, \ldots, w_n$ in a satisfying assignment to $\phi_1$ represent a witness $w$ to
$\phi_0$ in $\mathcal{R}$, the variables $z_1, \ldots, z_l$ are auxiliary variables added by the reduction,
and $l = \mathrm{poly}(n)$. We have that an assignment $a$ to $\phi_0$ is a satisfying assignment
iff there exists an assignment to $z$ such that $\phi_1(C(a), z) = 1$. I.e. $\phi_1$ checks that
$C(a)$ is a codeword and that $\phi_0(a) = 1$.

Amplification: Let $m = \mathrm{poly}(l)$. Since $l = \mathrm{poly}(n)$ it holds that $m = \mathrm{poly}(n)$
as well. Construct a 3CSP formula $\phi_2$ by duplicating the variables $w_1, \ldots, w_n$
so that the number of auxiliary variables $z_1, \ldots, z_l$ is small with respect to the
total number of variables in $\phi_2$.

Define $\phi_2(w_1, \ldots, w_n; w_1^1, \ldots, w_n^m; z_1, \ldots, z_l)$ to be

$$\phi_2 = \phi_1(w_1, \ldots, w_n; z_1, \ldots, z_l) \wedge \bigwedge_{i=1}^{n} (w_i = w_i^1) \wedge \bigwedge_{i=1}^{n} \bigwedge_{j=1}^{m-1} (w_i^j = w_i^{j+1}).$$
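
The duplication step can be sketched on CNF data. The code below is an illustration under stated assumptions: variables are numbered $1, \ldots, n$ for $w$, then $n+1, \ldots, n+l$ for $z$, then the copies $w_i^j$; clauses are tuples of signed integers; and each equality constraint is written as the logically equivalent pair of two-literal clauses. The names `amplify` and `count_models` are hypothetical.

```python
from itertools import product

def amplify(phi1, n, l, m):
    """Return (clauses of phi2, number of variables of phi2)."""
    def copy_var(i, j):
        # Index of the copy w_i^j, for i in 1..n and j in 1..m.
        return n + l + (j - 1) * n + i

    def equal(u, v):
        # (u = v) as two clauses: (u or not v) and (not u or v).
        return [(u, -v), (-u, v)]

    clauses = [tuple(c) for c in phi1]
    for i in range(1, n + 1):
        clauses += equal(i, copy_var(i, 1))
        for j in range(1, m):
            clauses += equal(copy_var(i, j), copy_var(i, j + 1))
    return clauses, n + l + n * m

def count_models(clauses, num_vars):
    """Brute-force count of satisfying assignments."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        if all(any((lit > 0) == bits[abs(lit) - 1] for lit in c) for c in clauses):
            count += 1
    return count

phi1 = [(1, 2, 3)]            # w1, w2 are core variables, variable 3 is auxiliary
phi2, total = amplify(phi1, n=2, l=1, m=2)
print(total, count_models(phi1, 3), count_models(phi2, total))
```

Because every copy is forced to equal its original, $\phi_2$ has the same number of satisfying assignments as $\phi_1$, while the auxiliary variables make up only $l$ of the $n + l + nm$ variables.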

Assume that one can efficiently find an assignment $x$ that approximates a
satisfying assignment for $\phi_2$ within Hamming distance $1/2 - \varepsilon$. For $j = 1 \ldots m$ let
$x^j$ be the restriction of $x$ over the variables $w_1^j \ldots w_n^j$. As $x$ is within Hamming
distance $1/2 - \varepsilon$ of a satisfying assignment to $\phi_2$, we conclude by definition of $\phi_2$
that there exists some $j$ such that $x^j$ is within Hamming distance $1/2 - \varepsilon$ from
$C(a)$ where $a$ is an assignment satisfying $\phi_0$ (we have neglected the auxiliary
variables $z_1, \ldots, z_l$ as $l$ is significantly smaller than $m$). This contradicts the
witness inapproximability of the relation $\mathcal{R}$ above, and concludes our assertion
regarding 3CSP formulas.

By translating clauses of the form $(w_i^j = w_i^{j+1})$ in $\phi_2$ to
$(w_i^j \vee \bar{w}_i^{j+1}) \wedge (\bar{w}_i^j \vee w_i^{j+1})$
we obtain a 3SAT formula with the desired inapproximability properties
as well. □

4 Witness Inapproximability of CLIQUE

In this section we consider the search version of CLIQUE in which we are given
a graph $G = (V, E)$ and we wish to obtain an indicator vector $x \in \{0,1\}^{|V|}$ so
that $\{v \in V \mid x_v = 1\}$ induces a maximum clique in $G$.

Claim 4. Approximating the indicator vector of the maximum CLIQUE within
Hamming distance $1/2 - \varepsilon$ is NP-hard.

Proof. Let $\phi$ be a 3SAT formula for which it is NP-hard to approximate a
satisfying assignment beyond Hamming distance $1/2 - \varepsilon$ (Claim 3). Given $\phi$ we
present a core instance $H$, and then amplify certain parts of $H$.

Core: Let $H$ be the graph obtained by the reduction from 3SAT to CLIQUE
described in [4]. For every clause in $\phi$ create a set of vertices corresponding to
assignments to the clause variables that set the clause to true. I.e. clauses of the
form $(l_1 \vee l_2 \vee l_3)$ yield seven vertices, each corresponding to an assignment that
satisfies the clause. Connect by an edge every two consistent vertices, i.e. vertices
that correspond to non-contradicting assignments to their respective variables.
It is not hard to verify that any satisfying assignment for $\phi$ corresponds to a
maximum clique in $H$, and vice versa.

Amplification: Let $y_1 \ldots y_n$ be the variables of $\phi$ and $m = \mathrm{poly}(n)$. For each
variable $y_i$ of $\phi$ add 2 new sets of vertices to $H$. A set of $m$ vertices
$K_i = \{v_i^1 \ldots v_i^m\}$ corresponding to a true truth value for the variable $y_i$, and a set of
$m$ vertices $\bar{K}_i = \{\bar{v}_i^1 \ldots \bar{v}_i^m\}$ corresponding to a false truth value for $y_i$. Let $V$
be the vertex set obtained by this addition. Let $G = (V, E)$ where the edge set
$E$ consists of an edge between every two consistent vertices of $V$ (note that $E$
includes the original edge set of $H$).

Let $x$ be an indicator vector of some maximum clique in $G$. A maximum
clique in $G$ corresponds to a satisfying assignment of $\phi$. Specifically, given a
satisfying assignment $a$ to $\phi$, the set of vertices in $G$ consistent with $a$ form a
maximum clique in $G$, and each maximum clique in $G$ is of this nature. Note
that for each $i$ this set of vertices includes one of the sets $K_i$ or $\bar{K}_i$.

Let $x'$ be an indicator vector which approximates the vector $x$ within Hamming
distance $1/2 - \varepsilon$. Using an averaging argument similar to the one used in
Claim 3, we conclude that one of the following holds. There exists some index
$j$ such that the values of the vector $x$ restricted to the vertices $v_1^j \ldots v_n^j$ agree
with the values of $x'$ restricted to these vertices on at least a $1/2 + \varepsilon$ fraction of
their values, or such an agreement exists on the vertices $\bar{v}_1^j \ldots \bar{v}_n^j$. As the values
of $x$ restricted to $v_1^j \ldots v_n^j$ or $\bar{v}_1^j \ldots \bar{v}_n^j$ correspond to a satisfying assignment $a$ of
$\phi$ (or its bitwise inverse) we conclude our claim. □
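
The core and amplification steps of this construction can be sketched for small formulas. The code below is illustrative and rests on assumptions: clauses are tuples of signed integers over three distinct variables, each vertex is a pair (label, partial assignment), and two vertices are joined when their partial assignments do not contradict each other. It only checks that the vertices consistent with a satisfying assignment form a clique of the expected size; it does not search for maximum cliques.

```python
from itertools import product, combinations

def build_clique_instance(clauses, n, m):
    """Return (vertices, edges) of the amplified graph G."""
    vertices = []
    # Core: one vertex per satisfying partial assignment of each clause.
    for c_idx, clause in enumerate(clauses):
        variables = [abs(lit) for lit in clause]
        for values in product([False, True], repeat=len(clause)):
            if any((lit > 0) == val for lit, val in zip(clause, values)):
                vertices.append((("clause", c_idx), tuple(zip(variables, values))))
    # Amplification: m vertices for "y_i true" and m for "y_i false".
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            vertices.append((("K", i, j), ((i, True),)))
            vertices.append((("Kbar", i, j), ((i, False),)))

    def consistent(u, v):
        au, av = dict(u[1]), dict(v[1])
        return all(av.get(var, val) == val for var, val in au.items())

    edges = {frozenset((u, v)) for u, v in combinations(vertices, 2) if consistent(u, v)}
    return vertices, edges

def clique_for_assignment(vertices, assignment):
    """Vertices whose partial assignment agrees with the full assignment."""
    return [v for v in vertices if all(assignment[var] == val for var, val in v[1])]

clauses = [(1, 2, 3), (-1, 2, -3)]
vertices, edges = build_clique_instance(clauses, n=3, m=2)
clique = clique_for_assignment(vertices, {1: True, 2: True, 3: False})
print(len(vertices), len(clique))
```

For a satisfying assignment the selected set contains one vertex per clause plus $m$ vertices per variable, so the amplification vertices dominate the clique once $m$ is large relative to the number of clauses.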

5 Witness Inapproximability of CHROMATIC NUMBER 

Given a graph $G = (V, E)$, the CHROMATIC NUMBER problem is the problem
of finding a coloring vector $\sigma \in \{1, \ldots, k\}^{|V|}$ such that for all $(u, v) \in E$ it holds
that $\sigma_u \ne \sigma_v$ and $k$ is of minimum cardinality.

Claim 5. Approximating the coloring vector of a 3-colorable graph within Hamming
distance $2/3 - \varepsilon$ is NP-hard.

Proof. Let $C$ be the error correcting code over $\Sigma = \{0, 1, 2\}$ implied by Theorem 1.
Recall that $C$ is $\delta = 2/3 - \varepsilon$ list decodeable. Let $\phi_0$ be a 3SAT formula,
$\mathcal{R} = \{(w, \phi_0) \mid \exists a \text{ s.t. } C(a) = w \text{ and } \phi_0(a) = \text{true}\}$, and $n$ be the size of a witness
$w$ to $\phi_0$ in $\mathcal{R}$. As in Claim 2, given $\phi_0$ it can be seen that one cannot approximate
a witness for $\mathcal{R}$ within Hamming distance $2/3 - \varepsilon$ (unless NP = P). Let
$\tau : \{0, 1, 2\} \to \{0, 1\}^2$ be the mapping which associates with any character in
$\{0, 1, 2\}$ its binary representation and denote the component-wise image of $C(a)$
under this mapping as $\tau(C(a))$. Let $\mathcal{R}' = \{(y, \phi_0) \mid \exists w \text{ s.t. } \tau(w) = y \text{ and } \mathcal{R}(w, \phi_0)\}$
be a binary representation of $\mathcal{R}$. Using any standard reduction (e.g., Cook’s theorem)
transform $\mathcal{R}'$ into a 3SAT formula $\phi_1$. Denote the variables of $\phi_1$ as
$\{y_1^0, y_1^1, \ldots, y_n^0, y_n^1; z_1, \ldots, z_l\}$ where each pair $y_i^0, y_i^1$ in a satisfying assignment
to $\phi_1$ represents the image of a single character of $C(a)$ under $\tau$ for some satisfying
assignment $a$ of $\phi_0$. The variables $z_i$ are auxiliary variables added by the
reduction, and $l = \mathrm{poly}(n)$. The formula $\phi_1$ is reduced to a graph $G = (V, E)$
as follows.

Core: Using a standard reduction from 3SAT to 3-coloring (for details see
[3]), we define a graph $H$ which corresponds to the formula $\phi_1$. This reduction
guarantees that the graph $H$ is 3-colorable iff the formula $\phi_1$ is satisfiable.
Furthermore, in such a case an assignment satisfying $\phi_1$ can be obtained from a
3-coloring of $H$ by setting the truth value of the variables in $\phi_1$ to be equal to
the colors of their corresponding vertices in $H$. In this reduction, these vertices
will be colored by one of two colors. We denote these colors as 1 and 0 and the
remaining color as 2.

Translation: Let $y_i^0, y_i^1$ be the two vertices in H corresponding to the two 
variables $y_i^0, y_i^1$ in $\varphi_1$. Let $\sigma$ be a valid 3-coloring of H which corresponds to a 
satisfying assignment to the formula $\varphi_1$ and let $\theta_i = \tau^{-1}(y_i^0, y_i^1)$ be the value 
of the binary representation of the colors assigned to $y_i^0, y_i^1$ in $\sigma$. Note that 
$\theta_i \in \{0, 1, 2\}$. We would like to add a new vertex to H which, given such a 
coloring, must be colored in the color $\theta_i$. Such a vertex will translate the Boolean 
colors assigned to $y_i^0, y_i^1$ into a corresponding color in {0, 1, 2}. This translation 
will be used later in our proof. We construct the gadget (a) in Figure 1 for each 
pair $y_i^0, y_i^1$. This gadget receives as input the bits $y_i^0$ and $y_i^1$ and fixes the value 
of $w_i^1$ to $\theta_i$. The components of gadget (a) are the gadgets (i), (ii) and (iii) 
presented in Figure 1. In these gadgets we assume that some vertices are colored 
by a specific color. The ability to do so is yet another property of the reduction 
used to construct the core graph H. 

Amplification: For each i, we now add m = poly(n) copies of each vertex $w_i^1$ 
using gadget (b) of Figure 1. Denote the resulting graph by G. Note that in any 
valid 3-coloring of G the colors of the vertices $w_i^j$ for j = 1, . . . , m are identical. 

Fig. 1. Translation and amplification of the core graph H. Gadget (a) uses gadgets 
(i), (ii) and (iii) in order to set the value of $w_i^1$ to the value whose binary representation 
is given by $y_i^0$ and $y_i^1$. Gadget (i) sets $w_i^1$ to be 1 whenever $y_i^0 = 0$ and $y_i^1 = 1$; otherwise it does 
not restrict $w_i^1$. Similarly, gadget (ii) sets $w_i^1$ to be 0 whenever $y_i^0 = 0$ and $y_i^1 = 0$, 
and gadget (iii) sets $w_i^1$ to be 2 whenever $y_i^0 = 1$ and $y_i^1 = 0$. Gadget (b) adds m new 
copies of $w_i^1$ and makes sure that they are colored by the same color. 

Let $\varphi_0$ be a satisfiable formula, G be the graph above, and $\sigma$ be a coloring 
vector of a valid 3-coloring of G. By our definition of $\varphi_1$ and G, if we set 
$w \in \{0,1,2\}^n$ to be the value of $\sigma$ restricted to the entries corresponding to 
the vertices $w_1^1, \dots, w_n^1$, we obtain a witness for $\varphi_0$ in $\mathcal{R}$. By following the line 
of proof given in Claim 4, we conclude that approximating the coloring vector $\sigma$ 
within Hamming distance $2/3 - \varepsilon$ yields an approximation of a witness w to $\varphi_0$ 
in the relation $\mathcal{R}$ within Hamming distance $2/3 - \varepsilon$. Detailed proof is omitted. 

□ 
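
The translation between characters of {0, 1, 2} and pairs of bits used in the proof above can be illustrated by a small sketch. The concrete encoding (0 as 00, 1 as 01, 2 as 10) is an assumption read off the gadget description in Fig. 1, and the function names are hypothetical; the normalized Hamming distance is the quantity in which the claims of this paper are stated.

```python
# Assumed encoding of tau: a character of {0,1,2} becomes two bits.
# The pair (1, 1) is unused, so decoding it raises a KeyError.
TAU = {0: (0, 0), 1: (0, 1), 2: (1, 0)}
TAU_INV = {bits: c for c, bits in TAU.items()}

def tau_word(word):
    """Component-wise image of a word over {0,1,2} as a flat list of bits."""
    return [b for c in word for b in TAU[c]]

def tau_inv_word(bits):
    """Decode consecutive bit pairs back into characters of {0,1,2}."""
    return [TAU_INV[(bits[t], bits[t + 1])] for t in range(0, len(bits), 2)]

def hamming_fraction(u, v):
    """Fraction of positions in which two equal-length vectors differ."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v)) / len(u)
```

For instance, `tau_word([2, 0, 1])` yields the six bits 1, 0, 0, 0, 0, 1, and decoding them with `tau_inv_word` recovers the original word.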



6 Remaining Problems 

In the following we consider the search version of the remaining problems from 
Karp’s list of 21 NP-complete decision problems. 

Given a graph G = (V, E), the INDEPENDENT SET problem is the problem 
of finding an indicator vector x so that $\{v \in V \mid x_v = 1\}$ is a maximum 
independent set in G. The VERTEX COVER problem is the problem of finding 
an indicator vector x so that $\{v \in V \mid x_v = 1\}$ is a minimum vertex cover in G. 

Claim 6. Approximating the indicator vector of the INDEPENDENT SET and 
VERTEX COVER problems within Hamming distance $1/2 - \varepsilon$ is NP-hard. 

Proof. Inapproximability of INDEPENDENT SET and VERTEX COVER follows 
from the inapproximability of CLIQUE by standard reductions. For 
INDEPENDENT SET, given a graph G, construct the complement graph $\bar{G}$. The 
indicator vector corresponding to the maximum clique in $\bar{G}$ is identical to the 
indicator vector representing the maximum independent set in G. For VERTEX 
COVER, given a graph G, the indicator vector corresponding to the maximum 
independent set in G is the bitwise complement of the indicator vector 
corresponding to the minimum vertex cover of G. □ 
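
The two correspondences used in this proof can be checked by exhaustive search on small graphs. The following is only an illustrative brute-force sketch (exponential time, hypothetical function names), not part of the reduction itself; since optimal solutions need not be unique, it compares the full sets of optimal indicator vectors.

```python
from itertools import combinations

def complement(n, edges):
    """Edges of the complement graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in present]

def is_independent(s, edges):
    return not any(u in s and v in s for u, v in edges)

def is_clique(s, edges):
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(s, 2))

def is_cover(s, edges):
    return all(u in s or v in s for u, v in edges)

def optimal_indicators(n, edges, feasible, maximize):
    """All optimal indicator vectors (as tuples) of feasible vertex subsets."""
    sizes = range(n, -1, -1) if maximize else range(n + 1)
    for size in sizes:
        found = {tuple(int(v in s) for v in range(n))
                 for s in map(set, combinations(range(n), size))
                 if feasible(s, edges)}
        if found:
            return found
    return set()
```

On any small graph, the maximum cliques of the complement coincide with the maximum independent sets of G, and flipping every bit of a maximum independent set gives a minimum vertex cover.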
Fig. 2. The reduced form of an edge (u, v) from the original graph G. The numbers 
appearing by each edge represent its multiplicity. The parameter m is some polynomial 
in the size of G. 



Given a directed graph G = (V, E), the FEEDBACK EDGE SET problem is 
the problem of finding an indicator vector $x \in \{0,1\}^{|E|}$ such that $\{e \in E \mid x_e = 1\}$ 
is a minimum subset of edges which intersects every directed cycle in G. We 
present a hardness result for the FEEDBACK EDGE SET problem in which 
we allow the given graph to have parallel edges. We do not know whether a 
similar hardness result holds if parallel edges are not allowed. 

Claim 7. Approximating the indicator vector for the FEEDBACK EDGE SET 
within Hamming distance $1/2 - \varepsilon$ is NP-hard. 

Proof. We enhance the standard reduction from VERTEX COVER to 
FEEDBACK EDGE SET presented in [11] by adding an amplification gadget. Let 
G = (V, E) be a given instance of the VERTEX COVER problem in which 
|V| = n, and H be the reduced instance for the FEEDBACK EDGE SET 
problem. Each vertex v in G is transformed into two vertices $v_1$ and $v_2$ in H. These 
vertices are connected by a set of directed edges: m = poly(n) edges from $v_1$ to 
$v_2$ and m + 1 edges from $v_2$ to $v_1$. In addition, each undirected edge (u, v) in G 
is transformed into 6 edges in H: three edges of the form $(v_1, u_2)$ and three edges of 
the form $(u_1, v_2)$. The portion of the graph H corresponding to one original edge 
(u, v) of G is presented in Figure 2. 

It can be seen that a minimum feedback edge set in H consists of the edges 
$(v_1, v_2)$ or the edges $(v_2, v_1)$ only (that is, crossing edges of the form $(u_1, v_2)$ will 
not appear in a minimum solution). Furthermore, for each pair of vertices $v_1, v_2$ 
in H the m + 1 edges $(v_2, v_1)$ appear in a minimum feedback edge set in H iff the 
vertex v appears in a minimum vertex cover of G. Using this correspondence, 
our claim is proven using techniques similar to those presented in Claim 4 and 
Claim 5. Detailed proof is omitted. □ 

Let G = (V, E) be an undirected (directed) graph. The UNDIRECTED 
(DIRECTED) HAMILTONIAN CYCLE problem is the problem of finding an 
indicator vector x of edges in E comprising an undirected (directed) Hamiltonian 
cycle. 

Claim 8. Approximating the indicator vector of the DIRECTED 
HAMILTONIAN CYCLE and HAMILTONIAN CYCLE within Hamming distance $1/2 - \varepsilon$ 
and $2/5 - \varepsilon$ respectively is NP-hard. 
Fig. 3. (a) The original ‘cover testing gadget’. (b) The gadget replacing the vertex 
$(v, e_{v[i]}, 1)$ from (a) with m = 4. 



Proof. We use the standard reduction presented in [6] from the VERTEX COVER 
problem on a given graph G of vertex size n, to the DIRECTED 
HAMILTONIAN CYCLE problem, and amplify some of the ‘cover testing gadgets’ used 
in this reduction. Using the notation from [6], a single cover testing gadget is 
presented in Figure 3(a). In the reduction of [6], a vertex cover in the original 
graph G is obtained from a Hamiltonian cycle in the reduced graph by setting 
the vertex v in G to be in the vertex cover if the Hamiltonian cycle in the reduced 
graph enters the vertex $(v, e_{v[i]}, 1)$ from a vertex $a_i$ (for some i). On the other 
hand, if the Hamiltonian cycle in the reduced graph enters $(v, e_{v[i]}, 1)$ from the 
vertex c then in the original graph G the vertex v is not included in the vertex 
cover. In order to amplify this difference, we replace each vertex $(v, e_{v[i]}, 1)$ in 
the original cover testing gadget by the additional gadget of size m = poly(n) 
presented in Figure 3(b). 

It can be seen that if the Hamiltonian cycle enters the vertex $(v, e_{v[i]}, 1)$ from 
$a_i$ on its way to the vertex b in Figure 3(a) then it must use the edges pointing 
right in Figure 3(b). Otherwise, the Hamiltonian cycle must enter $(v, e_{v[i]}, 1)$ from 
the vertex c in Figure 3(a). In this case the edges pointing left in Figure 3(b) 
will be used in order to reach the vertex b from c. 

Using the above it can be seen that approximating the indicator vector of the 
DIRECTED HAMILTONIAN CYCLE within Hamming distance $1/2 - \varepsilon$ is 
NP-hard. By replacing the gadget in Figure 3(b) by a different undirected gadget, one 
may achieve the hardness result of $2/5 - \varepsilon$ for the undirected HAMILTONIAN 
CYCLE problem. Possibly, a hardness result of $1/2 - \varepsilon$ can also be proven for the 
undirected HAMILTONIAN CYCLE problem, though we have not been able to 
do so. □ 

The remaining problems in Karp’s list of 21 NP-complete decision problems 
are: 3-DIMENSIONAL MATCHING (3DM), STEINER TREE, CLIQUE 
COVER, EXACT COVER, SET PACKING, SET COVER, HITTING SET, 
PARTITION, KNAPSACK, INTEGER PROGRAMMING, JOB SEQUENC- 
ING, MAX CUT, and FEEDBACK NODE SET. 

For all the above problems one can define natural witnesses, and prove tight 
witness inapproximability results by slight adjustments to the standard reduc- 
tions appearing in [11,6,7]. Full proofs will appear in a future version of our 
paper. 
Abstract. We consider geometric instances of the problem of finding 
a set of k vertices in a complete graph with nonnegative edge weights. 
In particular, we present algorithmic results for the case where vertices 
are represented by points in d-dimensional space, and edge weights cor- 
respond to rectilinear distances. This problem can be considered as a 
facility location problem, where the objective is to “disperse” a number 
of facilities, i.e., select a given number of locations from a discrete set of 
candidates, such that the average distance between selected locations is 
maximized. Problems of this type have been considered before, with the 
best result being an approximation algorithm with performance ratio 2. 
For the case where k is fixed, we establish a linear-time algorithm that 
finds an optimal solution. For the case where k is part of the input, we 
present a polynomial-time approximation scheme. 



1 Introduction 

A common problem in the area of facility location is the selection of a given 
number of k locations from a set P of n feasible positions, such that the selected 
set has optimal distance properties. A natural objective function is the maxi- 
mization of the average distance between selected points; dispersion problems of 
this type come into play whenever we want to minimize interference between the 
corresponding facilities. Examples include oil storage tanks, ammunition dumps, 
nuclear power plants, hazardous waste sites, and fast-food outlets (see [12,4]). 
In the latter paper the problem is called the Remote Clique problem. 

Formally, problems of this type can be described as follows: given a graph 
G = (V, E) with n vertices, and non-negative edge weights $w_{v_1,v_2} = d(v_1, v_2)$. 
Given $k \in \{2, \dots, n\}$, find a subset $S \subseteq V$ with |S| = k, such that 
$w(S) := \sum_{(v_i, v_j) \in E(S)} d(v_i, v_j)$ is maximized. (Here, E(S) denotes the edge set of the 
subgraph of G induced by the vertex set S.) 

From a graph theoretic point of view, this problem has been called a heaviest 
subgraph problem. Being a weighted version of a generalization of the problem 
of deciding the existence of a k-clique, i.e., a complete subgraph with k vertices, 
the problem is strongly NP-hard [14]. It should be noted that Håstad [9] showed 
that the problem Clique of maximizing the cardinality of a set of vertices with 
a maximum possible number of edges is in general hard to approximate within 
$n^{1-\varepsilon}$. For the heaviest subgraph problem, we want to maximize the number of 
edges for a set of vertices of given cardinality, so Håstad’s result does not imply 
an immediate performance bound. 

* Research partially supported by NSERC. 

K. Jansen and S. Khuller (Eds.): APPROX 2000, LNCS 1913, pp. 132-141, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 



Related Work 

Over recent years, there have been a number of approximation algorithms for 
various subproblems of this type. Feige and Seltser [7] have studied the graph 
problem (i.e., edge weights are 0 or 1) and showed how to find in time $n^{O((1+\log(n/k))/\varepsilon)}$ 
a k-set $S \subseteq V$ with $w(S) \ge (1 - \varepsilon)\binom{k}{2}$, provided that a k-clique exists. They 
also gave evidence that for $k \simeq n^{1/3}$, semidefinite programming fails to 
distinguish between graphs that have a k-clique, and graphs with densest k-subgraphs 
having average degree less than log n. 

Kortsarz and Peleg [10] describe a polynomial algorithm with performance 
guarantee $O(n^{0.3885})$ for the general case where edge weights do not have to obey 
the triangle inequality. A newer algorithm by Feige, Kortsarz, and Peleg [6] gives 
an approximation ratio of $O(n^{1/3} \log n)$. For the case where $k = \Omega(n)$, Asahiro, 
Iwama, Tamaki, and Tokuyama [3] give a greedy constant factor approximation, 
while Srivastav and Wolf [13] use semidefinite programming for improved 
performance bounds. For the case of dense graphs (i.e., $|E| = \Omega(n^2)$) and $k = \Omega(n)$, 
Arora, Karger, and Karpinski [1] give a polynomial time approximation scheme. 
On the other hand, Asahiro, Hassin, and Iwama [2] show that deciding the 
existence of a “slightly dense” subgraph, i.e., an induced subgraph on k vertices 
that has at least $\Omega(k^{1+\varepsilon})$ edges, is NP-complete. They also showed it is 
NP-complete to decide whether a graph with e edges has an induced subgraph on k 
vertices that has $\frac{ek^2}{n^2}(1 + O(n^{\varepsilon-1}))$ edges; the latter is only slightly larger than 
$\frac{ek^2}{n^2}\left(1 - \frac{n-k}{nk}\right)$, which is the average number of edges in a subgraph with k 
vertices. 

For the case where edge weights fulfill the triangle inequality, Ravi, 
Rosenkrantz, and Tayi [12] give a heuristic with time complexity $O(n^2)$ and prove 
that it guarantees a performance bound of 4. (See Tamir [15] with reference to 
this paper.) Hassin, Rubinstein, and Tamir [8] give a different heuristic with 
time complexity $O(n^2 + k^2 \log k)$ with performance bound 2. On a related note, 
see Chandra and Halldórsson [4], who study a number of different remoteness 
measures for the subset of size k, including total edge weight w(S). If the graph from 
which a subset of size k is to be selected is a tree, Tamir [14] shows that an 
optimal weight subset can be determined in O(nk) time. 

In many important cases there is even more known about the set of edge 
weights than just the validity of triangle inequality. This is the case when the 
vertex set V corresponds to a point set P in geometric space, and distances 
between vertices are induced by geometric distances between points. Given the 
practical motivation for considering the problem, it is quite natural to consider 
geometric instances of this type. In fact, it was shown by Ravi, Rosenkrantz, 
and Tayi in [12] that for the case of Euclidean distances in two-dimensional 
space, it is possible to achieve performance bounds that are arbitrarily close to 
$\pi/2 \approx 1.57$. For other metrics, however, the best performance guarantee is the 
factor 2 by [8]. 

An important application of our problem is data sampling and clustering, 
where points are to be selected from a large higher-dimensional set. Different 
metric dimensions of a data point describe different metric properties of a cor- 
responding item. Since these properties are not geometrically related, distances 
are typically not evaluated by Euclidean distances. Instead, some weighted $L_1$ 
metric is used. (See Erkut [5].) For data sampling, a set of points is to be se- 
lected that has high average distance. For clustering, a given set of points is to 
be subdivided into k clusters, such that points from the same cluster are close 
together, while points from different clusters are far apart. If we do the clustering 
by choosing k center points, and assigning any point to its nearest cluster center, 
we have to consider the same problem of finding a set of center points with large 
average distance, which is equivalent to finding a k-clique with maximum total 
edge weight. 



Main Results 

In this paper, we consider point sets P in d-dimensional space, where d is some 
constant. For the most part, distances are measured according to the rectilinear 
“Manhattan” norm $L_1$. 

Our results include the following: 

• A linear time (O(n)) algorithm to solve the problem to optimality in the case 
where k is some fixed constant. This is in contrast to the case of Euclidean 
distances, where there is a well-known lower bound of $\Omega(n \log n)$ in the 
computation tree model for determining the diameter of a planar point set, 
i.e., the special case d = 2 and k = 2 (see [11]). 

• A polynomial time approximation scheme for the case where k is not fixed. 
This method can be applied for arbitrary fixed dimension d. For the case 
of Euclidean distances in two-dimensional space, it implies a performance 
bound of $\sqrt{2} + \varepsilon$, for any given $\varepsilon > 0$. 

2 Preliminaries 

For the most part of this paper, all points are assumed to be points in the plane. 
Higher-dimensional spaces will be discussed in the end. Distances are measured 
using the $L_1$ norm, unless noted otherwise. The x- and y-coordinates of a point 
p are denoted by $x_p$ and $y_p$. If p and q are two points in the plane, then the 
distance between p and q is $d(p, q) = |x_p - x_q| + |y_p - y_q|$. We say that q is 
above p in direction (a, b) if the angle between the vector in direction (a, b) and 
the vector q - p is less than $\pi/2$, i.e., if the inner product $\langle q - p, (a, b) \rangle$ of the 
vector q - p and the vector (a, b) is positive. We say that a point p is maximal 
in direction (a, b) with respect to a set of points P if no point in P is above p 
in direction (a, b). For example, if p is an element of a set of points P and p 
has a maximal y-coordinate, then p is maximal in direction (0, 1) with respect to 
P, and a point p with minimal x-coordinate is maximal in direction (-1, 0) with 
respect to P. If the set P is clear from the context, we simply state that p is 
maximal in direction (a, b). 

The weight of a set of points P is the sum of the distances between all pairs 
of points in this set, and is denoted by w(P). Similarly, w(P, Q) denotes the 
total sum of distances between two sets P and Q. For $L_1$ distances, $w_x(P)$ and 
$w_x(P, Q)$ denote the sum of x-distances within P, or between P and Q. 
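
The definitions of this section translate directly into code. The following is a minimal sketch for points in the plane given as coordinate pairs; the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def l1(p, q):
    """Rectilinear distance d(p, q) = |x_p - x_q| + |y_p - y_q|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def weight(P):
    """w(P): sum of distances over all unordered pairs of points in P."""
    return sum(l1(p, q) for p, q in combinations(P, 2))

def weight_between(P, Q):
    """w(P, Q): total sum of distances between the sets P and Q."""
    return sum(l1(p, q) for p in P for q in Q)

def is_above(q, p, direction):
    """q is above p in direction (a, b) if <q - p, (a, b)> is positive."""
    a, b = direction
    return a * (q[0] - p[0]) + b * (q[1] - p[1]) > 0

def is_maximal(p, P, direction):
    """p is maximal in the given direction if no point of P is above it."""
    return not any(is_above(q, p, direction) for q in P)
```

For P = [(0, 0), (2, 1), (1, 3)], the weight is 3 + 4 + 3 = 10, the point (1, 3) is maximal in direction (0, 1), and (0, 0) is maximal in direction (-1, 0).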



3 Cliques of Fixed Size 

Let $S = \{s_0, s_1, \dots, s_{k-1}\}$ be a maximal weight subset of P, where k is a fixed 
integer greater than 1. We will label the x- and y-coordinates of a point $s \in S$ 
by some $(x_a, y_b)$ with $0 \le a < k$ and $0 \le b < k$ such that $x_0 \le x_1 \le \dots \le x_{k-1}$ 
and $y_0 \le y_1 \le \dots \le y_{k-1}$. So 

$$w(S) = \sum_{0 \le i < j < k} (x_j - x_i) + \sum_{0 \le i < j < k} (y_j - y_i).$$
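
Collecting terms in each sum shows that the coordinate with sorted index j enters with coefficient 2j + 1 - k, which is where the directions (2i + 1 - k, 2j + 1 - k) used in this section come from. The following short check of that identity for one coordinate is only an illustration; the function names are hypothetical.

```python
def pairwise_sum(xs):
    """Sum of x_j - x_i over all pairs i < j of the sorted coordinates."""
    xs = sorted(xs)
    return sum(xs[j] - xs[i] for j in range(len(xs)) for i in range(j))

def coefficient_sum(xs):
    """The same quantity written as sum_j (2j + 1 - k) * x_j."""
    xs = sorted(xs)
    k = len(xs)
    return sum((2 * j + 1 - k) * x for j, x in enumerate(xs))
```

Both functions return 15 on the coordinates 5, 1, 4, 1.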

Now we can use local optimality to reduce the family of subsets that we need 
to consider: 

Lemma 1. There is a maximal weight subset S of P of cardinality k, such that 
each point in S is maximal in direction $(2i + 1 - k, 2j + 1 - k)$ with respect to 
$P - S$ for some values of i and j with $0 \le i, j < k$. 

Proof. Consider an optimal subset $S \subseteq P$ of cardinality k. Let $s_i$ be a point in 
S such that there are $k - i - 1$ points $s_r = (x_r, y_r) \in S \setminus \{s_i\}$ with $x_r > x_i$ (i.e., 
“strictly to the right” of $s_i$), and i points $s_\ell = (x_\ell, y_\ell) \in S \setminus \{s_i\}$ with $x_\ell \le x_i$ 
(i.e., “to the left” of $s_i$). Similarly, assume that there are $k - j - 1$ points in $S \setminus \{s_i\}$ 
“strictly above” $s_i$, and j points in $S \setminus \{s_i\}$ “below” $s_i$. Now consider replacing $s_i$ 
by a point $s' = s_i + (h_x, h_y)$. This adds $h_x$ to the x-distances between $s_i$ and the 
i points to the left of $s_i$, while subtracting no more than $h_x$ from the x-distances 
between $s_i$ and the $k - i - 1$ points to the right of $s_i$. In the balance, replacing 
$s_i$ by $s'$ adds at least $(2i - k + 1)h_x$ to the total x-distance. Similarly, we get 
an addition to the total y-distance of at least $(2j - k + 1)h_y$. If $s'$ is above $s_i$ in 
direction $(2i - k + 1, 2j - k + 1)$, the overall change is positive, and the claim 
follows. 

□ 

Theorem 1. Given a constant value for k, a maximum weight subset S of a set 
of n points P, such that S has cardinality k, can be found in linear time. 

Proof. Consider all directions of the form $(2i + 1 - k, 2j + 1 - k)$ with $0 \le i, j < k$. 
For each direction (a, b), find $S_k(a, b)$, a set of k points that are maximal in 
direction (a, b) with respect to $P - S_k(a, b)$. Compute the set $\bigcup S_k(a, b)$ and try 
all possible subsets of size k of this set until a subset of maximal weight is found. 

Correctness follows from the fact that Lemma 1 implies that $S \subseteq \bigcup S_k(a, b)$. 
Since k is a constant, each set $S_k(a, b)$ can be found in linear time. Since the 
cardinality of $\bigcup S_k(a, b)$ is less than or equal to $k^3$, the result follows. 

□ 
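
The procedure in this proof can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the name `max_weight_subset` is hypothetical, ties in a direction are broken arbitrarily, and `heapq.nlargest` is used to pick k points with largest inner product, which takes O(n log k) time per direction and is therefore linear in n for constant k.

```python
import heapq
from itertools import combinations

def l1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def weight(S):
    return sum(l1(p, q) for p, q in combinations(S, 2))

def max_weight_subset(points, k):
    """Maximum L1-weight k-subset of planar points (needs len(points) >= k)."""
    idx = range(len(points))
    candidates = set()
    for i in range(k):
        for j in range(k):
            a, b = 2 * i + 1 - k, 2 * j + 1 - k
            # S_k(a, b): k points with largest inner product with (a, b)
            top = heapq.nlargest(
                k, idx, key=lambda t: a * points[t][0] + b * points[t][1])
            candidates.update(top)
    # at most k^3 candidates, so this enumeration takes constant time
    best = max(combinations(sorted(candidates), k),
               key=lambda c: weight([points[t] for t in c]))
    return [points[t] for t in best]
```

Indices rather than coordinates are collected so that repeated points in the input are treated as distinct candidates.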

Note that in the above estimate, we did not try to squeeze the constants in 
the O(n) running time. A closer look shows that for k = 2, not more than 2 
subsets of P need to be evaluated for possible optimality; for k = 3, 8 subsets 
are sufficient. 



4 Cliques of Variable Size 

In this section we consider the scenario where k is not fixed, i.e., part of the 
input. We show that there is a polynomial time approximation scheme (PTAS), 
i.e., for any fixed positive $\varepsilon$, there is a polynomial approximation algorithm that 
finds a solution that is within $(1 + \varepsilon)$ of the optimum. 

The basic idea is to use a suitable subset of m coordinates that subdivide 
an optimal solution into subsets of equal cardinality. More precisely, we find (by 
enumeration) a subdivision of an optimal solution into m x m rectangular cells 
Cij, each of which must contain a specific number kij of selected points. From 
each cell Cij, the points are selected in a way that guarantees that the total 
distance to all other cells except for the m — 1 cells in the same “horizontal” 
strip or the m — 1 cells in the same “vertical” strip is maximized. As it turns 
out, this can be done in a way that the total neglected distance within the strips 
is bounded by a fraction of (5m — 9)/(2(m — l)(m — 2)) of the weight of an 
optimal solution, yielding the desired approximation property. See Figure 1 for 
the overall picture. 

For ease of presentation we assume that k is a multiple of m and m > 2. 
Approximation algorithms for other values of k can be constructed in a similar 
fashion and will be treated in the full paper. Consider an optimal solution of k 
points, denoted by OPT. Furthermore consider a division of the plane by a set 
of m + 1 x-coordinates $\xi_0 \le \xi_1 \le \dots \le \xi_m$. Let $X_i := \{p = (x, y) \in \mathbb{R}^2 \mid \xi_i \le x \le \xi_{i+1}\}$, 
$0 \le i < m$, be the vertical strip between coordinates $\xi_i$ and $\xi_{i+1}$. By 
enumeration of possible choices of $\xi_0, \dots, \xi_m$ we may assume that the $\xi_i$ have 
the property that, for an optimal solution, from each of the m strips $X_i$ precisely 
k/m points of P are chosen. 

In a similar manner, suppose we know m + 1 y-coordinates $\eta_0 \le \eta_1 \le \dots \le \eta_m$ 
such that from each horizontal strip $Y_i := \{p = (x, y) \in \mathbb{R}^2 \mid \eta_i \le y \le \eta_{i+1}\}$, 
$0 \le i < m$, a subset of k/m points are chosen for an optimal solution. 

Let $C_{ij} := X_i \cap Y_j$, and let $k_{ij}$ be the number of points in OPT that are 
chosen from $C_{ij}$. Since $\sum_{0 \le i < m} k_{ij} = \sum_{0 \le j < m} k_{ij} = k/m$, we may assume by 
enumeration over the possible partitions of k/m into m pieces that we know all 
the numbers $k_{ij}$. 

Fig. 1. Subdividing the plane into cells. 

Finally, define the vector $v_{ij} := ((2i + 1 - m)k/m, (2j + 1 - m)k/m)$. Now our 
approximation algorithm is as follows: from each cell $C_{ij}$, choose the $k_{ij}$ points 
that are maximal in direction $v_{ij}$. (Overlap between the selections from different 
cells is avoided by proceeding in lexicographic order of cells, and choosing the 
$k_{ij}$ points among the candidates that are still unselected.) Let HEU be the point 
set selected in this way. 
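
The selection step for one guessed subdivision can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: the strip boundaries `xi`, `eta` and the counts `k_counts` are taken as given (the enumeration over all guesses is omitted), strips are treated as closed intervals, ties are broken arbitrarily, and the name `select_heu` is hypothetical.

```python
def select_heu(points, xi, eta, k_counts, k):
    """Choose k_counts[i][j] points from each cell C_ij, preferring points
    with largest inner product with v_ij; cells are visited in lexicographic
    order and points already selected are skipped."""
    m = len(xi) - 1
    used, chosen = set(), []
    for i in range(m):
        for j in range(m):
            # v_ij multiplied by m > 0, which leaves the ordering unchanged
            a, b = (2 * i + 1 - m) * k, (2 * j + 1 - m) * k
            cell = [t for t, (x, y) in enumerate(points)
                    if t not in used
                    and xi[i] <= x <= xi[i + 1]
                    and eta[j] <= y <= eta[j + 1]]
            cell.sort(key=lambda t: a * points[t][0] + b * points[t][1],
                      reverse=True)
            picked = cell[:k_counts[i][j]]
            if len(picked) < k_counts[i][j]:
                raise ValueError("cell (%d, %d) has too few points" % (i, j))
            used.update(picked)
            chosen.extend(points[t] for t in picked)
    return chosen
```

In the lower left cell the direction has two negative components, so the point closest to the lower left corner is preferred; in the upper right cell the point farthest to the upper right is preferred.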

It is clear that HEU can be computed in polynomial time. We will proceed by 
a series of lemmas to determine how well w(HEU) approximates w(OPT). In the 
following, we consider the distances involving points from a particular cell $C_{ij}$. 
Let $HEU_{ij}$ be the set of $k_{ij}$ points that are selected from $C_{ij}$ by the heuristic, 
and let $OPT_{ij}$ be a set of $k_{ij}$ points of an optimal solution that are attributed 
to $C_{ij}$. Let $S_{ij} = OPT_{ij} \cap HEU_{ij}$. Furthermore we define $\bar{S}_{ij} = HEU_{ij} \setminus OPT_{ij}$, 
and $\tilde{S}_{ij} = OPT_{ij} \setminus HEU_{ij}$. Let $HEU_{i\bullet}$, $OPT_{i\bullet}$, $HEU_{\bullet j}$ and $OPT_{\bullet j}$ be the set of 
k/m points selected from $X_i$ and $Y_j$ by the heuristic and an optimal algorithm 
respectively. Finally $\overline{HEU}_{i\bullet} := HEU \setminus HEU_{i\bullet}$, $\overline{HEU}_{\bullet j} := HEU \setminus HEU_{\bullet j}$, 
$\overline{OPT}_{i\bullet} := OPT \setminus OPT_{i\bullet}$, and $\overline{OPT}_{\bullet j} := OPT \setminus OPT_{\bullet j}$. 

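Lemma 2. 

$$w_x(HEU_{ij}, \overline{HEU}_{i\bullet}) + w_y(HEU_{ij}, \overline{HEU}_{\bullet j}) \ge w_x(OPT_{ij}, \overline{OPT}_{i\bullet}) + w_y(OPT_{ij}, \overline{OPT}_{\bullet j}).$$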

Proof. Consider a point $p \in \tilde{S}_{ij} = OPT_{ij} \setminus HEU_{ij}$. Thus, there is a point 
$p' \in \bar{S}_{ij} = HEU_{ij} \setminus OPT_{ij}$ that was chosen by the heuristic instead of p. Now we 
can argue as in Lemma 1: Let $h = (h_x, h_y) = p' - p$. When replacing p in OPT by $p'$, 
we increase the x-distance to the ik/m points “left” of $C_{ij}$ by $h_x$, while decreasing 
the x-distance to the $(m - i - 1)k/m$ points “right” of $C_{ij}$ by $h_x$. In the balance, 
this yields a change of $((2i + 1 - m)k/m)h_x$. Similarly, we get a change of 
$((2j + 1 - m)k/m)h_y$ for the y-coordinates. By definition, we have assumed that 
the inner product $\langle h, v_{ij} \rangle \ge 0$, so the overall change of distances is nonnegative. 

Performing these replacements for all points in $OPT \setminus HEU$, we can transform 
OPT to HEU, while increasing the sum $w_x(OPT_{ij}, \overline{OPT}_{i\bullet}) + w_y(OPT_{ij}, \overline{OPT}_{\bullet j})$ 
to the sum $w_x(HEU_{ij}, \overline{HEU}_{i\bullet}) + w_y(HEU_{ij}, \overline{HEU}_{\bullet j})$. □ 

In the following three lemmas we show that the total difference between the 
weight of an optimal solution w{OPT) and total value of all the right hand 
sides (when summed over i) of the inequality in Lemma 2 is a small fraction of 
w{OPT). 



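Lemma 3. 

$$\sum_{0 < i < m-1} w_x(OPT_{i\bullet}) \le \frac{w_x(OPT)}{2(m - 2)}.$$

Proof. Let $\delta_i = \xi_{i+1} - \xi_i$. Since $i(m - i - 1) \ge m - 2$ for $0 < i < m - 1$, we have 
for $0 < i < m - 1$ 

$$w_x(OPT_{i\bullet}) \le \frac{k^2}{2m^2}\,\delta_i \le \frac{ik}{m} \cdot \frac{(m - i - 1)k}{m}\,\delta_i \cdot \frac{1}{2(m - 2)}.$$

Since OPT has ik/m and $(m - i - 1)k/m$ points to the left of $\xi_i$ and right of 
$\xi_{i+1}$ respectively, we have 

$$w_x(OPT) \ge \sum_{0 < i < m-1} \frac{ik}{m} \cdot \frac{(m - i - 1)k}{m}\,\delta_i,$$

so 

$$\sum_{0 < i < m-1} w_x(OPT_{i\bullet}) \le \frac{1}{2(m - 2)}\, w_x(OPT).$$

□ 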



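Lemma 4. For i = 0 and i = m - 1 we have 

$$w_x(OPT_{i\bullet}) \le \frac{w_x(OPT)}{m - 1}.$$

Proof. Without loss of generality assume i = 0. Let $x_0, x_1, \dots, x_{(k/m)-1}$ be the 
x-coordinates of the points $p_0, p_1, \dots, p_{(k/m)-1}$ in $OPT_{0\bullet}$. So 

$$w_x(OPT_{0\bullet}) = \left(\frac{k}{m} - 1\right)(x_{\frac{k}{m}-1} - x_0) + \left(\frac{k}{m} - 3\right)(x_{\frac{k}{m}-2} - x_1) + \dots$$
$$\le \left(\frac{k}{m} - 1\right)(\xi_1 - x_0) + \left(\frac{k}{m} - 3\right)(\xi_1 - x_1) + \dots$$
$$\le \frac{k}{m}(\xi_1 - x_0) + \frac{k}{m}(\xi_1 - x_1) + \dots$$

Since $\xi_1 - x_j \le x - x_j$ where $0 \le j < k/m$ and x is the x-coordinate of any 
point in $\overline{OPT}_{0\bullet}$, and since there are $(m - 1)k/m$ points in $\overline{OPT}_{0\bullet}$, we have 

$$\xi_1 - x_j \le \frac{m}{(m - 1)k}\, w_x(p_j, \overline{OPT}_{0\bullet}),$$

so 

$$w_x(OPT_{0\bullet}) \le \frac{k}{m} \cdot \frac{m}{(m - 1)k} \sum_{0 \le j < \frac{k}{2m}} w_x(p_j, \overline{OPT}_{0\bullet})$$
$$\le \frac{1}{m - 1} \sum_{0 \le j < \frac{k}{m}} w_x(p_j, \overline{OPT}_{0\bullet})$$
$$= \frac{1}{m - 1}\, w_x(OPT_{0\bullet}, \overline{OPT}_{0\bullet})$$
$$\le \frac{1}{m - 1}\, w_x(OPT).$$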



This proves the main properties. Now we only have to combine the above 
estimates to get an overall performance bound: 



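Lemma 5. 

$$\sum_{0 \le i < m} w_x(OPT_{i\bullet}, \overline{OPT}_{i\bullet}) + \sum_{0 \le j < m} w_y(OPT_{\bullet j}, \overline{OPT}_{\bullet j}) \ge \left(1 - \frac{5m - 9}{2(m - 1)(m - 2)}\right) w(OPT).$$

Proof. From Lemmas 3 and 4 (the interior strips contribute at most $\frac{1}{2(m-2)}$ and 
each of the two boundary strips at most $\frac{1}{m-1}$ of $w_x(OPT)$) we derive that 

$$\sum_{0 \le i < m} w_x(OPT_{i\bullet}) \le \frac{5m - 9}{2(m - 1)(m - 2)}\, w_x(OPT)$$

and similarly 

$$\sum_{0 \le j < m} w_y(OPT_{\bullet j}) \le \frac{5m - 9}{2(m - 1)(m - 2)}\, w_y(OPT).$$

Since 

$$w(OPT) = w_x(OPT) + w_y(OPT)$$
$$= \sum_{0 \le i < m} w_x(OPT_{i\bullet}, \overline{OPT}_{i\bullet}) + \sum_{0 \le i < m} w_x(OPT_{i\bullet}) + \sum_{0 \le j < m} w_y(OPT_{\bullet j}, \overline{OPT}_{\bullet j}) + \sum_{0 \le j < m} w_y(OPT_{\bullet j}),$$

the result follows. □ 

Putting together Lemma 2 and the error estimate from Lemma 5, the 
approximation theorem can now be proven. 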
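Theorem 2. For any fixed m, HEU can be computed in polynomial time, and 

$$w(HEU) \ge \left(1 - \frac{5m - 9}{2(m - 1)(m - 2)}\right) w(OPT).$$

Proof. The claim about the running time is clear. Using Lemmas 2 and 5 we 
derive 

$$w(HEU) \ge \sum_{0 \le i < m} w_x(HEU_{i\bullet}, \overline{HEU}_{i\bullet}) + \sum_{0 \le j < m} w_y(HEU_{\bullet j}, \overline{HEU}_{\bullet j})$$
$$\ge \sum_{0 \le i < m} w_x(OPT_{i\bullet}, \overline{OPT}_{i\bullet}) + \sum_{0 \le j < m} w_y(OPT_{\bullet j}, \overline{OPT}_{\bullet j})$$
$$\ge \left(1 - \frac{5m - 9}{2(m - 1)(m - 2)}\right) w(OPT).$$

□ 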
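
To see how the bound of Theorem 2 turns into an approximation scheme, one can compute how large m must be for a given $\varepsilon$. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the search simply looks for the smallest m > 2 with $1/(1 - c(m)) \le 1 + \varepsilon$, where $c(m) = (5m - 9)/(2(m - 1)(m - 2))$.

```python
from fractions import Fraction

def neglected_fraction(m):
    """c(m) = (5m - 9) / (2(m - 1)(m - 2)); defined for m > 2."""
    if m <= 2:
        raise ValueError("m must be greater than 2")
    return Fraction(5 * m - 9, 2 * (m - 1) * (m - 2))

def smallest_m(eps):
    """Smallest m > 2 such that w(OPT) / w(HEU) <= 1 + eps is guaranteed."""
    eps = Fraction(eps)
    m = 3
    while (neglected_fraction(m) >= 1
           or 1 / (1 - neglected_fraction(m)) > 1 + eps):
        m += 1
    return m
```

For small m the bound is vacuous (c(3) = 3/2 and c(4) = 11/12), so a guaranteed factor of 2 already requires m = 7, where c(7) = 13/30.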



5 Implications 



It is straightforward to modify our above arguments to point sets under $L_1$ 
distances in an arbitrary d-dimensional space, with fixed d. 

Theorem 3. Given a constant value for k and d, the maximal weight subset S 
of a set of n points in d-dimensional space, such that S has cardinality k, can be 
found in linear time. If d and $\varepsilon$ are constants, but k is not fixed, then there is a 
polynomial time algorithm that finds a subset whose weight is within $(1 + \varepsilon)$ of 
the optimum. 



Furthermore, we can use the approximation scheme from the previous section 
to get a $(\sqrt{2} + \varepsilon)$ approximation factor for the case of Euclidean distances in 
two-dimensional space, for any $\varepsilon > 0$: In polynomial time, find a k-set S such 
that $L_1(S)$ is within $(1 + \varepsilon)$ of an optimal solution $OPT_1$ with respect to $L_1$ 
distances. Let $OPT_2$ be an optimal solution with respect to $L_2$ distances. Then 

$$L_2(S) \ge \frac{1}{\sqrt{2}} L_1(S) \ge \frac{1}{\sqrt{2}(1 + \varepsilon)} L_1(OPT_1) \ge \frac{1}{\sqrt{2}(1 + \varepsilon)} L_1(OPT_2) \ge \frac{1}{\sqrt{2}(1 + \varepsilon)} L_2(OPT_2),$$

and the claim follows. 
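
The chain of inequalities rests on the planar norm comparison $L_1/\sqrt{2} \le L_2 \le L_1$, applied pair by pair and summed. A small numerical illustration, with hypothetical function names and a tolerance for floating-point rounding:

```python
import math
from itertools import combinations

def l1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def l2(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def total(points, dist):
    """Sum of pairwise distances of a point set under the given metric."""
    return sum(dist(p, q) for p, q in combinations(points, 2))

def norms_comparable(points, tol=1e-9):
    """Check L1(S)/sqrt(2) <= L2(S) <= L1(S) up to a rounding tolerance."""
    w1, w2 = total(points, l1), total(points, l2)
    return w1 / math.sqrt(2) <= w2 + tol and w2 <= w1 + tol
```

The first inequality is tight for pairs on a diagonal line, such as (0, 0) and (1, 1), and the second for axis-parallel pairs.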



6 Conclusions 

We have presented algorithms for geometric instances of the maximum weighted 
k-clique problem. Our results give a dramatic improvement over the previous 
best approximation factor of 2 that was presented in [12] for the case of general 
metric spaces. This underlines the observation that geometry can help to get 
better algorithms for problems from combinatorial optimization. 

Furthermore, the algorithms in [12] give better performance for Euclidean 
metric than for Manhattan distances. We correct this anomaly by showing that 
among problems involving geometric distances, the rectilinear metric may allow 
better algorithms than the Euclidean metric. 
Abstract. We study the online page replication problem. We present 
a new randomized algorithm for rings which is 2.37297-competitive, im- 
proving the best previous result of 3.16396. We also show that no ran- 
domized algorithm is better than 1.75037-competitive on the ring; pre- 
viously, only a 1.58198 bound for a single edge was known. We extend 
the problem in several new directions: continuous metrics, variable size 
requests, and replication before service. Finally, we give simplified proofs 
of several known results. 



1 Introduction 

Paging and caching are some of the most fundamental and practical problems 
in computer science. The advent of the world wide web has brought about a 
new wave of interest in such problems, as caching strategies have the potential 
to greatly decrease web page access times. We study one of the most basic, but 
surprisingly under-investigated, paging problems: The online page replication 
problem. 

In this problem, processors are connected by a network, and pages can be 
stored at each processor. The processors might be the processors of a multi- 
processor computer architecture, connected by a bus; or web browsers, connected 
via the WWW. If processor v wants to access an item of a page which is con- 
tained in its own memory, the access is free of cost. Otherwise, the data must 
be transmitted from another processor w which has the page in its memory. 
The cost of this access is proportional to the distance between v and w in the 
network. It is also possible to copy, replicate, the entire page — but this is much 
more expensive. However, if many requests occur at the same node, replication 
might pay off in the long run. 

Unfortunately, for any particular page, requests appear one at a time in 
unpredictable locations, so we must decide online (i.e., without knowledge of 
future request patterns) to which processors a page should be replicated. This is 
known as the online page replication problem (PRP) which we investigate in this 
paper within the framework of competitive analysis [11,12]. An online algorithm 
is said to be c-competitive if for all problem instances the cost incurred by the 
algorithm is at most c times the cost incurred by an optimal offline algorithm 
(i.e., an algorithm with full knowledge of the future) on that same problem 
instance. 

* This work was done while the authors were at the Max-Planck-Institute for 
Computer Science, Saarbrücken. Both authors were partially supported by the EU 
ESPRIT LTR Project No. 20244 (ALCOM-IT), WP 3.2. The first author was also 
supported by a Habilitation Scholarship of the German Research Foundation (DFG). 

K. Jansen and S. Khuller (Eds.): APPROX 2000, LNCS 1913, pp. 144-154, 2000. 
For arbitrary networks, PRP is equivalent to the online Steiner tree problem [7], 
whose competitive ratio is $\Theta(\log n)$ [15,2], where n is the number of nodes 
in the network. However, most practical processor networks have a simple structure, 
and therefore most previous research on PRP has focused on tree or ring 
topologies. Tree networks are fairly well understood [1]; the situation 
for rings is not as good. It is often the case that an online problem on a tree 
is much easier than for other networks. Therefore, the investigation of online 
problems on non-tree networks, such as the ring, is an important pursuit. 

Contributions of this Paper: Our most important contribution is a better 
understanding of PRP on rings. We provide the first randomized lower bound 
for rings (1.75037) which takes full advantage of the ring topology (beating 
the edge lower bound of 1.58198 [1]), and greatly improve the randomized up- 
per bound from 3.16396 [1] to 2.37297. The upper bound is a new application 
of an important and elegant technique: probabilistic approximation of metric 
spaces [18,4,6,5]. 

It is possible to extend the basic definition of PRP in several directions, which 
may prove to be closer to actual application. Towards this end we investigate 
several new variants of PRP: PRP on continuous rings and trees, PRP with 
varying request sizes, and PRP when replication is allowed before request service 
(as opposed to after in the normal definition). 

Another important contribution of our work is to provide simpler analysis of 
the work that has gone before. The previously mentioned continuous PRP model 
proves to be quite valuable in helping us understand certain randomized algo- 
rithms. Essentially, any algorithm for the continuous model yields a randomized 
algorithm for the discrete model with equal competitive ratio. For trees, this 
is an exact correspondence, i.e. any randomized tree algorithm implies a deter- 
ministic algorithm for the continuous model with equal competitive ratio. The 
notion of unfairness, which has proven to be valuable in other contexts [22,6], 
also proves itself here. 

As we are unable to present an exposition of all of these results in the space 
available, we provide details only on our randomized bounds, in Sections 3 and 4, 
and algorithms for the continuous model, in Section 5. We give a detailed list of 
the other results in Section 6. We begin by giving some formal definitions in the 
next section. 
2 Definitions 

An instance of PRP is defined formally as follows: We are given a metric space 
$M = (P, D)$, a point $s \in P$ and a positive integer d. The point s is called the 
origin, which is the point that initially contains the page; d is called the page 
replication factor. An antipode is a point at maximal distance from the origin. 
When considering tree metrics, we consider the origin to be the root of the tree. 

We receive a sequence of requests $\sigma$ to points in P. For each request u, we 
must pick a point v, which already has the page, and a path p from v to u. We 
serve the request by transmitting the required data along p. We are given the 
option to replicate along p, in which case every point on p will be given the page. 
If we choose to replicate, we pay $d \cdot \mathrm{length}(p)$, whereas if we do not we pay only 
$\mathrm{length}(p)$. The entire problem is deciding when to replicate. 
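
To make the cost model concrete, here is a minimal sketch for the simplest possible instance: a single edge of length 1 from the origin s to a node t, with every request arriving at t. The threshold rule and all function names below are illustrative assumptions, not an algorithm from the paper; the sketch only shows how the "pay length(p) or pay d times length(p)" accounting turns into a competitive ratio.

```python
def offline_cost(n_requests: int, d: int) -> int:
    """Optimal offline cost for n_requests at the far end of a unit edge.

    Either every request is served remotely (cost 1 each), or the page is
    replicated on the first request (cost d) and later accesses are free.
    """
    return min(n_requests, d)


def online_cost(n_requests: int, d: int, threshold: int) -> int:
    """Cost of the hypothetical rule 'replicate on request number threshold'."""
    if n_requests < threshold:
        return n_requests            # all requests served remotely
    return (threshold - 1) + d       # remote accesses, then one replication


def worst_case_ratio(d: int, threshold: int) -> float:
    """Largest online/offline cost ratio over sequence lengths 1..3d."""
    return max(online_cost(n, d, threshold) / offline_cost(n, d)
               for n in range(1, 3 * d + 1))


if __name__ == "__main__":
    d = 10
    for threshold in (1, d // 2, d, 2 * d):
        print(threshold, worst_case_ratio(d, threshold))
```

Under these assumptions the rule that replicates on the d-th request has worst-case ratio 2 - 1/d, which is the kind of deterministic guarantee that randomization then improves on.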

There are several types of adversaries that can be used when one considers 
randomized algorithms [8]. We consider exclusively the oblivious adversary (which 
cannot adapt its request sequence to the random choices of the online algorithm). 



3 A Randomized Upper Bound for Rings 



We present an algorithm for page replication when the metric space M is a ring, 
which we call Random Cut. We show that it is 2.37297-competitive. The best 
previously known result is a 3.16396-competitive algorithm [1]. 

Without loss of generality we assume throughout this section that the cir- 
cumference of the given ring is 1. Further, we associate each point on the ring 
with a real number in [0, 1). This number is the distance clockwise to the origin 
s. 

The algorithm is based on the Geometric algorithm of Albers and Koga [1] 
for trees, which is $\Gamma(d)$-competitive where 

$$\Gamma(d) = \frac{\alpha^d}{\alpha^d - 1}, \qquad \alpha = 1 + \frac{1}{d}.$$

As d goes to infinity this approaches $e/(e - 1) < 1.58198$. We use a technique called 
probabilistic approximation of metric spaces to obtain a better algorithm for 
rings. This technique was first used by Karp [18] and further developed and 
used by Bartal et al. [4,6,5]. The idea is to cut the ring at some point u, chosen 
at random. We then run an algorithm for page replication on trees on the tree 
with root s and branches consisting of the clockwise path from s to u and the 
counterclockwise path from s to u, as illustrated in Figure 1. 

The idea of using a random cut point was put forward by Albers and Koga [1]. 
However, they consider only a uniformly distributed cut point, which yields a 
$2\Gamma(d)$-competitive algorithm. 

We denote by Geometric(u) the algorithm which cuts the ring at u and 
runs Geometric on the resulting tree. Precisely, the Random Cut algorithm 
works as follows: 
1. Pick $u \in (0, 1)$ at random with probability density $p(u)$, where: 

$$p(u) = \begin{cases} q(1 - u) & \text{if } u < \tfrac{1}{2}, \\ q(u) & \text{otherwise.} \end{cases}$$

2. Run Geometric(u). 

Fig. 1. Cutting the ring at u. 






We show that Random Cut is $\frac{3}{2}\Gamma(d)$-competitive for the ring. Note that 

$$\lim_{d \to \infty} \frac{3\Gamma(d)}{2} = \frac{3e}{2(e - 1)} < 2.37297.$$
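
As a quick numeric sanity check of the two constants quoted in this section, the following sketch evaluates the ratio of Geometric, taken here as $\Gamma(d) = \alpha^d/(\alpha^d - 1)$ with $\alpha = 1 + 1/d$, together with the factor $\frac{3}{2}\Gamma(d)$ stated for Random Cut. The function name `gamma` is ours and purely illustrative.

```python
import math


def gamma(d: int) -> float:
    """Ratio alpha^d / (alpha^d - 1) with alpha = 1 + 1/d."""
    # (1 + 1/d)^d, computed via log1p so that large d stays accurate.
    alpha_pow_d = math.exp(d * math.log1p(1.0 / d))
    return alpha_pow_d / (alpha_pow_d - 1.0)


if __name__ == "__main__":
    limit = math.e / (math.e - 1.0)
    for d in (1, 2, 10, 100, 10**6):
        print(d, gamma(d), 1.5 * gamma(d))
    print("limits:", limit, 1.5 * limit)
```

The values decrease with d towards e/(e - 1), so the quoted constants are the limiting ratios for large replication factors.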

First note that the distribution used is valid, as $p(z) > 0$ for $0 < z < 1$ and 

$$\int_0^1 p(z)\,dz = 1.$$



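Consider the offline algorithm which is optimal among those offline algorithms 
which do not replicate or serve requests across u. We call this algorithm 
Topt(u). Clearly, for any fixed cut point u, we have 

$$\mathrm{E}[\mathrm{cost}(\mathrm{Geometric}(u), \sigma)] \le \Gamma(d) \cdot \mathrm{cost}(\mathrm{Topt}(u), \sigma),$$

for all request sequences $\sigma$. We prove that 

$$\mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] \le \frac{3}{2} \cdot \mathrm{cost}(\mathrm{OPT}, \sigma),$$

for all $\sigma$. This shows the desired result as 

$$\begin{aligned}
\mathrm{E}[\mathrm{cost}(\mathrm{Random\ Cut}, \sigma)] &= \int_0^1 p(u) \cdot \mathrm{E}[\mathrm{cost}(\mathrm{Geometric}(u), \sigma)]\,du \\
&\le \Gamma(d) \int_0^1 p(u) \cdot \mathrm{cost}(\mathrm{Topt}(u), \sigma)\,du \\
&= \Gamma(d) \cdot \mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] \\
&\le \Gamma(d) \cdot \frac{3}{2} \cdot \mathrm{cost}(\mathrm{OPT}, \sigma),
\end{aligned}$$

for all $\sigma$. 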
Consider the algorithm OPT which serves $\sigma$ with minimal cost. Without loss 
of generality, OPT replicates distance x clockwise, distance 1 - y counterclockwise, 
and then serves all requests. Due to symmetry, we need only consider $x \ge 1 - y$. 

Define 

$$P(a, b) = \int_a^b p(z)\,dz.$$

Based on the choice of the cut point u, there are two possibilities. 

The first is that u lies in that portion of the ring where OPT has replicated. 
Topt(u) pays at most d, as this is the cost of replicating around the entire ring. 
The expected cost incurred by Topt due to this case is at most 

$$d\,(P(0, x) + P(y, 1)).$$

In the other case, u lies in the portion of the ring where OPT has not replicated. 
Consider the offline algorithm which replicates exactly as OPT, but does 
not cross u in serving requests. All requests in (x, u) are served from x, while all 
requests in (u, y) are served from y. Certainly, the cost incurred by Topt is at 
most the cost incurred by this algorithm. For $z \in [0, 1)$ we define $\phi(z)$ to be the 
number of requests at point z. The expected cost incurred by Topt due to this 
case is at most 

$$d(x + 1 - y)P(x, y) + \sum_{x < z < y} \phi(z)\,[(z - x)P(z, y) + (y - z)P(x, z)].$$

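The total cost is therefore 

$$\begin{aligned}
\mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] &= d\,(P(0, x) + P(y, 1)) + d(x + 1 - y)P(x, y) \\
&\quad + \sum_{x < z < y} \phi(z)\,[(z - x)P(z, y) + (y - z)P(x, z)] \\
&= d - d(y - x)P(x, y) \\
&\quad + \sum_{x < z < y} \phi(z)\,[(z - x)P(z, y) + (y - z)P(x, z)].
\end{aligned}$$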

The optimal offline solution replicates clockwise to point x and counterclockwise 
to point y, paying at least $d(x + 1 - y)$ for this. Requests in (x, y) are served 
from the closest endpoint. Define $m = (x + y)/2$. In terms of $\phi$, x and y the total 
optimal offline cost is 

$$d(x + 1 - y) + \sum_{x < z \le m} \phi(z)(z - x) + \sum_{m < z < y} \phi(z)(y - z).$$

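To show the desired result we prove that 
$\mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] - \frac{3}{2} \cdot \mathrm{cost}(\mathrm{OPT}) \le 0$. 
We start by rewriting this as follows: 

$$\begin{aligned}
&\mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] - \frac{3}{2} \cdot \mathrm{cost}(\mathrm{OPT}) \\
&\quad = d - d(y - x)P(x, y) + \sum_{x < z < y} \phi(z)\,[(z - x)P(z, y) + (y - z)P(x, z)] \\
&\qquad - \frac{3}{2}\Big[ d(x + 1 - y) + \sum_{x < z \le m} \phi(z)(z - x) + \sum_{m < z < y} \phi(z)(y - z) \Big] \\
&\quad = d\,f(x, y) + \sum_{x < z \le m} \phi(z)\,g(x, y, z) + \sum_{m < z < y} \phi(z)\,h(x, y, z),
\end{aligned}$$

where 

$$\begin{aligned}
f(x, y) &= \tfrac{3}{2}(y - x) - (y - x)P(x, y) - \tfrac{1}{2}, \\
g(x, y, z) &= (z - x)P(z, y) + (y - z)P(x, z) - \tfrac{3}{2}(z - x), \\
h(x, y, z) &= (z - x)P(z, y) + (y - z)P(x, z) - \tfrac{3}{2}(y - z).
\end{aligned}$$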

To complete the proof we show the following three lemmas: 

Lemma 1. $f(x, y) \le 0$ for all $0 \le x < y \le 1$ and $x \ge 1 - y$. 

Lemma 2. $g(x, y, z) \le 0$ for all $0 \le x < z < y \le 1$ and $x \ge 1 - y$. 

Lemma 3. $h(x, y, z) \le 0$ for all $0 \le x < z < y \le 1$ and $x \ge 1 - y$. 

The proofs are purely algebraic, and are omitted due to space considerations. 

It is also possible to prove that the result given here is the best possible of its 
type. Specifically, we are able to prove that for any $\varepsilon > 0$ there is no distribution 
over cut points u such that 

$$\mathrm{E}[\mathrm{cost}(\mathrm{Topt}(u), \sigma)] \le \Big(\frac{3}{2} - \varepsilon\Big)\,\mathrm{cost}(\mathrm{OPT}, \sigma)$$

for all $\sigma$. Again, the proof is left for the full version. 

4 A Randomized Lower Bound for Rings 





Fig. 2. The ring $C_4$. 



Prior to this work, no randomized lower bound specific to the ring metric 
was known. The lower bound of Albers and Koga [1] for a single edge carries 
over to the ring, but a better bound is possible: We show that no randomized 
algorithm can be better than 

$$\max_{0 < w < 1} \frac{4e^{w+1} + e}{4e^{w+1} - 2e^{w} - 2ew} > 1.75037 \qquad (1)$$

competitive, even on a 4-node ring. The construction makes use of Yao’s corollary 
to the von Neumann minimax principle [24,25]. This principle states that, for 
a given problem, one can show a lower bound cost of c for any randomized 
algorithm by showing a distribution over inputs such that the expected cost to 
any deterministic algorithm is c. 
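
The constant in (1) can be checked numerically. The sketch below assumes that the expression in (1) reads $(4e^{w+1} + e)/(4e^{w+1} - 2e^{w} - 2ew)$ and runs a plain grid search over w; the function names and the grid resolution are illustrative choices of ours.

```python
import math


def ring_bound(w: float) -> float:
    """Value of (4e^(w+1) + e) / (4e^(w+1) - 2e^w - 2ew) at the point w."""
    e = math.e
    numerator = 4.0 * math.exp(w + 1.0) + e
    denominator = 4.0 * math.exp(w + 1.0) - 2.0 * math.exp(w) - 2.0 * e * w
    return numerator / denominator


def maximize(steps: int = 10000):
    """Grid search for the maximizer of ring_bound on [0, 1]."""
    best_w = max((i / steps for i in range(steps + 1)), key=ring_bound)
    return best_w, ring_bound(best_w)


if __name__ == "__main__":
    w_star, value = maximize()
    print(w_star, value)
```

The maximum sits in the interior of the interval, slightly above w = 0.7, so whether the endpoints are included in the range of w makes no difference to the bound.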

We use the graph $C_4$ illustrated in Figure 2 and a distribution over request 
sequences of the following form: 

- Sequence $\sigma(\ell)$ consists of $\ell$ requests at node t. 

- Sequence $\sigma(\ell, x)$, $x \in \{u, v\}$, consists of $\ell$ requests at node t followed by d 
requests at node x. 

The value of $\ell$ falls in the range $1 \le \ell \le 2d$. We define $p_\ell$ to be the probability 
that $\sigma(\ell)$ is given and $q_\ell$ to be the probability that $\sigma(\ell, u)$ is given. The 
probability of $\sigma(\ell, v)$ is the same as that of $\sigma(\ell, u)$. Define 

$$\alpha = 1 + \frac{1}{d}, \qquad \beta = 2d\alpha^{d+k} - k\alpha^{d} - d\alpha^{k}.$$



The distribution has a positive integer parameter k. Given k and d we use: 

2£a^+'^-^ 



Pi = 



qt = 



(d-l)/3’ 

{d + £)> 






for 1 < £ <k, 



iov k < £ < d, 



q2d = 



2{d-l)P ’ 
da^ 

All other values of $p_\ell$ and $q_\ell$ are 0. We are able to show that the competitiveness 
of any deterministic algorithm on this distribution is at least 

$$c(d, k) = \frac{d\alpha^{d}(4\alpha^{k} + 1)}{2\beta}.$$

Since the adversary chooses k, $\max_k c(d, k)$ is a lower bound for any randomized 
algorithm. As d goes to infinity, this approaches (1), which can be seen as follows: 



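$$\begin{aligned}
\lim_{d \to \infty} \max_k \frac{d\alpha^{d}(4\alpha^{k} + 1)}{2\beta}
&= \lim_{d \to \infty} \max_w \frac{\alpha^{d}(4\alpha^{\lfloor wd \rfloor} + 1)}{4\alpha^{\lfloor wd \rfloor + d} - 2\lfloor wd \rfloor \alpha^{d}/d - 2\alpha^{\lfloor wd \rfloor}} \\
&= \max_w \frac{\lim_{d \to \infty} \big(4\alpha^{\lfloor wd \rfloor + d} + \alpha^{d}\big)}{\lim_{d \to \infty} \big(4\alpha^{\lfloor wd \rfloor + d} - 2\lfloor wd \rfloor \alpha^{d}/d - 2\alpha^{\lfloor wd \rfloor}\big)} \\
&= \max_w \frac{4e^{w+1} + e}{4e^{w+1} - 2e^{w} - 2ew}.
\end{aligned}$$

The details of the proof are quite technical, and shall be presented in the full 
version. 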
5 Continuous Metrics 

There are situations where any point on an edge of the network can be requested 
or store the object. Consider the nomads’ problem where the network is a road 
map and the object is water. If a tribe of nomads temporarily settles at some 
place in the desert, the tribesmen need water. This could mean a daily walk to 
the nearest well. However, this need could also be satisfied by building a pipeline 
from the well to the settlement. If the nomads plan to stay for a longer period 
of time, this might be a good investment. Of course, the pipeline can be built 
in phases, each phase covering part of the distance. The nomads’ problem is an 
example of PRP on a continuous metric space. 

The prior work on PRP considered only discrete metric spaces. We present 
the first results on continuous metrics. While this may a priori seem unmoti- 
vated, we find an interesting relationship between deterministic algorithms for 
continuous metrics, and randomized algorithms for discrete metrics (with the 
same topology). 

We also introduce the notion of unfair PRP, which again may seem unmotivated, 
but which also proves to be quite valuable. Let $\alpha \ge 1$ be a real number. 
In $\alpha$-unfair PRP, the algorithm has replication factor $\alpha d$, while the adversary 
has replication factor d. If $\alpha = 1$ then we have the traditional PRP problem, 
which we distinguish as fair PRP. 

We give deterministic algorithms for continuous trees and rings which are 

-competitive and l)-competitive, respectively. These results give 

us vastly simplified proofs of the corresponding results for randomized algorithms 
on discrete metrics due to Albers and Koga [1] and Glazek [13]. 

We begin by considering a very simple situation: $\alpha$-unfair PRP on a single 
continuous edge (s, t). For the time being, we restrict the adversary to request 
points only at t. Without loss of generality the length of the edge is 1. 

We consider an algorithm which we call Geometric, based on the algorithms 
of the same name in [1,20]. Geometric balances its cost such that it always 
achieves the same competitive ratio, independent of the length of the request 
sequence. At the k-th request, it replicates to the point at distance 

$$a_k = \frac{\theta^{k} - 1}{\theta^{d} - 1}$$

from s, where $\theta = 1 + \frac{1}{\alpha d}$. Note that $a_0 = 0$ and $a_d = 1$. For $1 \le k \le d$, 
Geometric must pay 

$$1 \cdot (1 - a_{k-1}) + \alpha d \cdot (a_k - a_{k-1}) = \frac{\theta^{d}}{\theta^{d} - 1}$$

for serving the k-th request and replicating to $a_k$. Since this cost is constant for all 
requests, it is also equal to the competitive ratio of Geometric (the adversary 
replicates to t if and only if there are at least d requests). The optimality of 
Geometric also follows directly from this property. 
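
The balancing property can be verified in exact rational arithmetic. The sketch below assumes the replication points are $a_k = (\theta^k - 1)/(\theta^d - 1)$ with $\theta = 1 + 1/(\alpha d)$, which is the choice that makes the per-request cost independent of k; the function names are ours.

```python
from fractions import Fraction


def replication_points(d, alpha):
    """Return theta and the points a_0, ..., a_d on the unit edge (s, t)."""
    theta = 1 + Fraction(1) / (Fraction(alpha) * d)
    points = [(theta**k - 1) / (theta**d - 1) for k in range(d + 1)]
    return theta, points


def per_request_costs(d, alpha):
    """Cost of request k: remote access from a_{k-1}, then replication to a_k."""
    _, a = replication_points(d, alpha)
    return [
        1 * (1 - a[k - 1]) + Fraction(alpha) * d * (a[k] - a[k - 1])
        for k in range(1, d + 1)
    ]


if __name__ == "__main__":
    d, alpha = 5, 2
    theta, points = replication_points(d, alpha)
    print("points:", points)
    print("costs: ", per_request_costs(d, alpha))
    print("ratio: ", theta**d / (theta**d - 1))
```

Every entry of the cost list equals $\theta^d/(\theta^d - 1)$ exactly, so against an adversary that pays 1 per request until it replicates, the ratio is the same after any number of requests.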
While this result on its own may seem unimpressive, it allows us to derive 
algorithms for both continuous trees and continuous rings, where the adversary 
may request any point: 

Theorem 4. If there is a c-competitive algorithm for a single (discrete or con- 
tinuous) edge then there is a c-competitive algorithm for (discrete or continuous) 
trees. 

Proof. (Sketch) We handle all edges of the tree independently. In particular, if 
a node is requested, it recursively also puts a request at its parent in the tree. 

In the discrete case this implies that whenever a node wants to replicate the 
page from its parent in the tree the parent must already have the page. 

In the continuous case, we run a virtual algorithm which assumes that the 
endpoints of all edges which are nearer to s have the page right from the begin- 
ning. Then, on any path to the root, we have an alternating sequence of empty 
and full edge segments. This configuration can now easily be transformed into 
a legal configuration by pushing all full edge segments into the direction of the 
root. This configuration is then realized by the algorithm. □ 

Theorem 5. If there exists a c-competitive algorithm for the 2-unfair PRP prob- 
lem on continuous trees, then there exists a c-competitive algorithm for fair PRP 
on the continuous ring. 

Proof. (Sketch) The continuous ring is divided into two halves: one running 
clockwise from the origin to the antipode, and one running counterclockwise. 
We collapse these two halves into a single line running from the origin to the 
antipode. We run our algorithm for the 2-unfair PRP problem on this line, 
treating all requests as if they occurred on it. We maintain the invariant that 
if the algorithm has replicated distance x from the origin on the line, then in 
reality we have replicated distance x in both directions from the origin on the 
ring. The (unfair) cost the algorithm incurs on the line is exactly the same as 
the (fair) cost on the ring, while the adversary’s cost on the ring is at least as 
great as that on the line. □ 
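
The folding step in this proof sketch can be illustrated with a few lines of code. We assume the ring has circumference 1 with the origin at coordinate 0 (so the antipode is at 1/2), and that the algorithm currently holds the page on the two arcs of length x on either side of the origin; all names below are ours.

```python
from fractions import Fraction


def fold(p):
    """Map a ring point p in [0, 1) to its distance from the origin on the line."""
    return min(p, 1 - p)


def ring_service_distance(p, x):
    """Distance on the ring from p to the replicated arcs [0, x] and [1 - x, 1)."""
    if p <= x or p >= 1 - x:
        return Fraction(0)
    return min(p - x, (1 - x) - p)


def line_service_distance(p, x):
    """Distance on the folded line from fold(p) to the replicated segment [0, x]."""
    return max(Fraction(0), fold(p) - x)


def ring_replication_cost(delta, d):
    """Fair cost of extending both arcs by delta on the ring."""
    return d * (2 * delta)


def line_replication_cost(delta, d):
    """2-unfair cost of extending the segment by delta on the line."""
    return (2 * d) * delta
```

Both the service distance and the replication cost agree between the two views, which is the invariant the proof relies on for the algorithm's side of the comparison.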

We remark that this result extends to discrete rings that are symmetric. 

As we have mentioned, these results imply those given in [1] and [13] for 
randomized algorithms (the crucial observation is that deterministic algorithms 
for the continuous edge (or tree) correspond naturally to randomized algorithms 
for the discrete edge (or tree), as we have seen above; in particular, randomization 
does not help on the continuous edge or tree). We ask the reader to note the 
simplicity of our argument relative to those given in [1,13]. 

6 Further Results and Conclusions 

We have given new randomized upper and lower bounds for online page replica- 
tion, and presented a new and more elegant way to prove several old results. Our 
proofs make use of general techniques which were developed for other problems. 
We have a number of other results on PRP which shall be given in the full 
version: 

1. We consider a variation on PRP where replication is allowed before the 
request is served (as opposed to after). In general, upper and lower bounds 
for both models approach the same limiting value as d goes to infinity. Those 
in the before model approach the limit from below, while those in the after 
model approach from above. However, there is no simple reduction between 
the models. 

2. We consider weighted requests. 

3. We give simplified proofs of the lower bounds of Glazek [14] of 2.31023 
for the continuous ring and 2.36603 for the 4-node uniform discrete ring. 

4. We give a simplified proof of the upper bound of Glazek [14] of 2.36604 for 
the 4-node uniform discrete ring. 
Abstract. We prove hardness results for approximating set splitting 
problems and also instances of satisfiability problems which have no 
“mixed” clauses, i.e., every clause has either all its literals unnegated 
or all of them negated. Results of Håstad [9] imply tight hardness results 
for set splitting when all sets have size exactly $k \ge 4$ elements and also 
for non-mixed satisfiability problems with exactly k literals in each clause 
for $k \ge 4$. We consider the case k = 3. For the Max E3-Set Splitting 
problem in which all sets have size exactly 3, we prove an NP-hardness 
result for approximating within any factor better than 19/20. This result 
holds even for satisfiable instances of Max E3-Set Splitting, and 
is based on a PCP construction due to Håstad [9]. For “non mixed Max 
3Sat”, we give a PCP construction which is a variant of one in [8] and 
use it to prove the NP-hardness of approximating within a factor better 
than 11/12, and also a hardness factor of $15/16 + \varepsilon$ (for any $\varepsilon > 0$) for 
the version where each clause has exactly 3 literals (as opposed to up to 
3 literals). 



1 Introduction 

We study the approximability of set splitting problems and satisfiability prob- 
lems whose clauses are restricted to have either all literals unnegated or all of 
them negated. The latter seems to be a natural variant of the fundamental sat- 
isfiability problem. 



1.1 Set Splitting Problems 

We first discuss the set splitting problems we consider and the prior work on 
them. In the general Max Set Splitting problem, we are given a universe $U$ 
and a family $\mathcal{F}$ of subsets of $U$, and the goal is to find a partition of $U$ into two 
(not necessarily equal sized) sets as $U = U_1 \cup U_2$ that maximizes the number of 
subsets in $\mathcal{F}$ that are split (where a set $S \subseteq U$ is said to be split by the partition 
$U = U_1 \cup U_2$ if $S \cap U_1 \ne \emptyset$ and $S \cap U_2 \ne \emptyset$). The version when all subsets in the 
family $\mathcal{F}$ are of size exactly k is referred to as Max Ek-Set Splitting. For any 
fixed $k \ge 2$, Max Ek-Set Splitting was shown to be NP-hard by Lovász [13]. 
Obviously, Max E2-Set Splitting is exactly the extensively studied Max Cut 
problem. Max Cut is known to be NP-hard to approximate within $16/17 + \varepsilon$, 
for any $\varepsilon > 0$ [9,16], and Goemans and Williamson, in a major breakthrough, 
used semidefinite programming to give a factor 0.878-approximation algorithm 
for Max Cut [5].¹ Here we investigate the approximability of Max Ek-Set 
Splitting for $k \ge 3$. 

The Max Set Splitting problem is related to the constraint satisfaction 
problem Max NAE Sat, which is a variant of Max Sat, but where the goal 
is to maximize the total weight of the clauses that contain both true and false 
literals. Max Set Splitting is simply a special case of Max NAE Sat where 
all literals appear unnegated (i.e., Max Set Splitting is the same problem 
as monotone Max NAE Sat). Similarly Max Ek-Set Splitting is just the 
monotone version of Max NAE-Ek-Sat, and in fact is the same problem as 
Max 2-Colorable Hypergraph on k-uniform hypergraphs (i.e., given a k-uniform 
hypergraph, find a 2-coloring of its vertices that maximizes the number 
of hyperedges which are not monochromatic). 
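
The objective just defined is easy to state in code. The following sketch counts the sets split by a given partition and finds the optimum by exhaustive search, which is of course only feasible for tiny instances; the function names and the example instances are ours.

```python
from itertools import combinations, product


def count_split(sets, side):
    """Number of sets that contain elements from both sides of the partition.

    `side` maps each element of the universe to 0 or 1.
    """
    return sum(1 for s in sets if len({side[v] for v in s}) == 2)


def max_split_bruteforce(universe, sets):
    """Maximum number of split sets over all 2^|U| partitions."""
    best = 0
    for bits in product((0, 1), repeat=len(universe)):
        side = dict(zip(universe, bits))
        best = max(best, count_split(sets, side))
    return best


if __name__ == "__main__":
    universe = [0, 1, 2, 3]
    three_sets = [set(c) for c in combinations(universe, 3)]
    print(max_split_bruteforce(universe, three_sets))    # all 3-subsets of a 4-set
    triangle = [{0, 1}, {1, 2}, {0, 2}]
    print(max_split_bruteforce([0, 1, 2], triangle))     # E2 case, i.e. Max Cut
```

With sets of size 2 this is exactly Max Cut, and reading each set as a hyperedge gives the 2-coloring formulation mentioned above.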

Prior Work. We discuss below the status of the Max Ek-Set Splitting and 
Max NAE-Ek-Sat problems for $k \ge 3$. These problems are all NP-hard and 
MAX SNP-hard. Moreover, it was shown that for each $k \ge 3$, there is a constant 
$\varepsilon_k > 0$ such that it is NP-hard to distinguish between Max Ek-Set Splitting 
instances where all sets can be split by some partition (which we call satisfiable 
instances in the sequel), and those where no partition splits more than a $(1 - \varepsilon_k)$ 
fraction of the sets [14]. 

Following the striking inapproximability results of Håstad [9], it has become 
possible to prove reasonable explicit bounds on the inapproximability ratios of 
Max Ek-Set Splitting and Max NAE-Ek-Sat by construction of appropriate 
gadgets (see [16] for formal definitions of gadgets; we freely use this terminology 
throughout the paper). In particular, the result for k = 2 (i.e., Max Cut) 
mentioned above follows this approach, and so does the $11/12 + \varepsilon$ hardness result 
for Max NAE-E2-Sat [9]. In the same paper [9], Håstad proved a tight inapproximability 
bound of $1 - 2^{-k} + \varepsilon$, for an arbitrary constant $\varepsilon > 0$, for satisfiable 
instances of Max k-Sat, for $k \ge 3$. It follows that even satisfiable instances of 
Max NAE-Ek-Sat, for $k \ge 4$, are hard to approximate within $1 - 2^{-k+1} + \varepsilon$, 
for an arbitrary constant $\varepsilon > 0$. Note that this result is tight, since a random 
truth assignment will “satisfy” a fraction $1 - 2^{-k+1}$ of the clauses of a Max 
NAE-Ek-Sat instance. This leaves only the case k = 3, and this turns out to 
be an intriguing case where a tight result is not yet in sight. We now review the 
results that are known for k = 3. 
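
The fraction achieved by a random assignment can be confirmed by direct enumeration. The sketch below assumes a clause over k distinct variables, so that its k literals are independent fair coin flips, and counts the truth patterns that are neither all true nor all false; the function name is ours.

```python
from fractions import Fraction
from itertools import product


def nae_fraction(k: int) -> Fraction:
    """Fraction of the 2^k literal patterns that are not all equal."""
    good = sum(
        1
        for bits in product((False, True), repeat=k)
        if any(bits) and not all(bits)
    )
    return Fraction(good, 2**k)


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, nae_fraction(k))
```

For k = 3 this gives 3/4, which is well below the ratios of the algorithms reviewed next, so the random assignment is not the benchmark in that case.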

For Max NAE-E3-Sat, a hardness of approximation within $15/16 + \varepsilon$, even 
for satisfiable instances, follows from Håstad’s inapproximability result for Max 
3-Sat and an easy 2-gadget from Max 3-Sat to Max NAE-E3-Sat [14,17]. On 
the algorithmic side, the best known approximation algorithm, due to Zwick [18], 
achieves a ratio of 0.908. (This bound is as yet only based on numerical evidence; 
the best proven bound is 0.87868 [11,17], which is slightly better than 
the Goemans and Williamson approximation guarantee for Max Cut [5].) For 
satisfiable instances of Max NAE-E3-Sat, an approximation ratio of 0.91226 
can be achieved in polynomial time [17]. 

¹ Throughout this paper we deal only with maximization problems, and by an $\alpha$ factor 
approximation we mean a solution whose value is at least $\alpha$ times that of an optimum 
solution. Consequently, all factors of approximation we discuss will be less than 1. 

Turning again to set splitting problems, Håstad [9] has shown the tight result 
that it is NP-hard to approximate (even satisfiable instances of) Max E4-Set 
Splitting within $7/8 + \varepsilon$, for any $\varepsilon > 0$. For Max E3-Set Splitting, by the 
results of Zwick [17,18] mentioned above, there exist approximation algorithms 
achieving a ratio of 0.908 (resp. 0.912) for general (resp. satisfiable) instances. 
As regards hardness results for Max E3-Set Splitting, no explicit bound 
in the literature appears to be correct. Inapproximability within a factor of 
(approximately) 0.987 is claimed in [10]; and it is mentioned in [1] that the 9-gadget 
reducing a $PC_0$ constraint to 3-Set Splitting constraints that appears in 
[10] (a $PC_0$ constraint is of the form $x \oplus y \oplus z = 0$ where x, y, z are unnegated 
variables), together with Håstad’s inapproximability result for Max 3-Parity, 
implies a hardness of $17/18 + \varepsilon$ for Max E3-Set Splitting. These claims 
suffer from the “well-known” flaw in early gadget results that use $PC_0$ gadgets 
without giving explicit $PC_1$ gadgets (a $PC_1$ constraint checks if $x \oplus y \oplus z = 1$) 
to conclude hardness results for approximating monotone constraint satisfaction 
problems (like Max Cut, Max E3-Set Splitting, etc). The problem is that 
when the target problem is monotone, one cannot “convert” a $PC_1$ constraint to 
a $PC_0$ constraint by simply negating a variable, and one has to pay an explicit 
cost in the gadget for negating a variable. This error occurs in early versions of 
[2] and in [10,1]. For the case of Max Cut, the error can be (and was) fixed in 
[2], who construct a $PC_1$ gadget from a $PC_0$ gadget by negating a variable at a 
unit extra cost. One can similarly fix the error for Max E3-Set Splitting by 
incurring an extra cost of 4 for the $PC_1$ gadget, and this gives a hardness result 
for approximating Max E3-Set Splitting to better than a factor 21/22 (this 
result is reported formally in an earlier version of this paper [6]). 

In light of the work of Trevisan et al. [16] on methods for finding optimal 
gadgets, it is natural to ask why we cannot use their techniques to search for 
and find the optimal gadgets for set splitting. It turns out that it is not possible 
to guarantee an optimal gadget by the means in [16], because 3-Set splitting 
constraints are not hereditary (a constraint family is hereditary if identifying 
two variables of a function results in a new function that is either in the family 
or is the all 0 or all 1 function; see [16] for details). The linear programs involved 
in getting the best gadget in even some reasonable subclass of gadgets are too 
big to be solved², and it is probably not worthwhile to pursue this approach as 
one is not guaranteed a proof of optimality of the gadget anyway. This makes 
the question of pin-pointing the approximability of Max E3-Set Splitting all 
the more intriguing. 

Remark. It is easy to see that Max 2-Sat reduces to Max NAE-E3-Sat, 
and that Max E3-Set Splitting reduces to Max Cut in an approximation 
preserving way. However, the best algorithm known for Max 2-Sat [4] achieves 
a ratio of 0.931, while only a weaker ratio of 0.908 [18] is known for Max NAE-E3-Sat. 
Similarly, while the best approximation ratio to date for Max Cut is 
0.878 [5], a better factor of 0.908 is known for Max E3-Set Splitting (using 
the same algorithm as the one for Max NAE-E3-Sat [18]). 

² This was pointed out to us by Greg Sorkin. 

Venkatesan Guruswami 



1.2 Satisfiability with No Mixed Clauses 

The set splitting problem is just a special case of NAE-Sat where one adds the 
restriction that in every clause all literals appear unnegated (or, equivalently, 
all appear negated), i.e. no clause is “mixed”. This leads us to consider the 
corresponding question for the even more fundamental problem of satisfiability 
with the restriction that none of the clauses in the instance are “mixed”. We 
refer to the version of Max Sat where all clauses have at most k literals and 
none of the clauses have both negated and unnegated literals as Max k-NM-Sat 
(here NM-Sat stands for non-mixed satisfiability). The version where all 
the clauses have exactly k literals will be referred to as Max Ek-NM-Sat. This 
problem appears to be a fairly natural variant of Sat, and does not appear to 
have been explicitly considered in the literature. 

Known Results on Approximating Max k-NM-Sat: Clearly, any algorithm that 
approximates Max k-Sat within factor $\alpha_k$ also approximates Max k-NM-Sat 
within the same factor; in particular approximation factors of $\alpha_2 = 0.931$ and 
$\alpha_3 = 7/8$ can be achieved in this way [4,11]. For Max Ek-NM-Sat, an approximation 
factor of $1 - 2^{-k}$ can be achieved trivially for all $k \ge 3$, by simply picking 
a random truth assignment. There are no algorithms known which perform any 
better on non-mixed clauses than on general satisfiability instances. For $k \ge 4$, 
a recent result of Håstad [9] shows that Max Ek-Set Splitting is hard to approximate 
within a factor of $1 - 2^{-k+1} + \varepsilon$ for any $\varepsilon > 0$. Since there is a trivial 
2-gadget reducing Max Ek-Set Splitting to Max Ek-NM-Sat (namely, replace 
the constraint $\mathrm{split}(x_1, x_2, \ldots, x_k)$ by the two clauses $(x_1 \vee x_2 \vee \cdots \vee x_k)$ and 
$(\bar{x}_1 \vee \bar{x}_2 \vee \cdots \vee \bar{x}_k)$), this implies that Max Ek-NM-Sat (and hence also Max 
k-NM-Sat) is NP-hard to approximate within a factor better than $(1 - 2^{-k})$. 
Hence, for $k \ge 4$, the naive algorithms that work for the more general Max 
Ek-Sat are really the best possible for Max Ek-NM-Sat as well. As in the 
case of set splitting problems, our focus, therefore, is on the case k = 3. 
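
The 2-gadget from set splitting to non-mixed clauses can be checked exhaustively. The sketch below assumes the gadget consists of the all-positive clause and the all-negated clause on the same k variables, and confirms that a split set satisfies both clauses while an unsplit set satisfies exactly one. It also spells out the arithmetic that turns a split fraction s into a clause fraction (1 + s)/2. All names are ours.

```python
from fractions import Fraction
from itertools import product


def is_split(bits) -> bool:
    """True if the set has elements on both sides of the partition."""
    return any(bits) and not all(bits)


def satisfied_clauses(bits) -> int:
    """Number of satisfied clauses among (x1 v ... v xk) and (~x1 v ... v ~xk)."""
    positive = any(bits)
    negative = any(not b for b in bits)
    return int(positive) + int(negative)


def check_gadget(k: int) -> bool:
    """Split sets satisfy 2 clauses; unsplit sets satisfy exactly 1."""
    for bits in product((False, True), repeat=k):
        expected = 2 if is_split(bits) else 1
        if satisfied_clauses(bits) != expected:
            return False
    return True


def clause_fraction(split_fraction: Fraction) -> Fraction:
    """Fraction of clauses satisfied when a fraction of the sets is split."""
    return (1 + split_fraction) / 2
```

Plugging in s = 1 - 2^{-k+1} gives 1 - 2^{-k}, matching the factor stated in the text.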



1.3 Our Main Results 

For Max E3-Set Splitting, we prove that for every $\varepsilon > 0$, it is NP-hard to 
find a partition that splits more than a $19/20 + \varepsilon$ fraction of the 3-sets of a satisfiable 
Max E3-Set Splitting instance. This result is proved by demonstrating 
a simple gadget reducing the predicate $\mathrm{SNE}_4(x, y, z, w) = (x \ne y) \vee (z \ne w)$ to 
3-set splitting constraints. This improves the hardness factor of $27/28 + \varepsilon$ for approximating 
satisfiable instances of Max E3-Set Splitting that was reported 
in an earlier version of this paper [?], and is in addition a lot simpler. 
For Max 3-NM-Sat, one can prove an inapproximability ratio of $13/14 + \varepsilon$ by 
starting with a hard to approximate instance of Max 3-Sat, and using a 2-gadget 
to replace each mixed clause with clauses that only have either all negated or all 
unnegated literals (for example, replace a clause $(a \vee b \vee \bar{c})$ with $(a \vee b \vee t)$ and 
$(\bar{t} \vee \bar{c})$, and a clause $(a \vee \bar{b} \vee \bar{c})$ with $(a \vee t)$ and $(\bar{t} \vee \bar{b} \vee \bar{c})$). For Max E3-NM-Sat, 
this method gives hardness within a factor of $19/20 + \varepsilon$. Both these hardness 
results apply for satisfiable instances of non-mixed Sat as well. 
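
The two replacements for mixed clauses can be verified by brute force. The sketch below assumes the negation pattern in which a clause with one negated literal, (a or b or not c), becomes (a or b or t) and (not t or not c), and a clause with two negated literals, (a or not b or not c), becomes (a or t) and (not t or not b or not c), with t a fresh auxiliary variable. The helper names are ours.

```python
from itertools import product


def best_over_t(clauses, a, b, c) -> int:
    """Maximum number of gadget clauses satisfiable by a choice of auxiliary t."""
    return max(
        sum(1 for clause in clauses if clause(a, b, c, t))
        for t in (False, True)
    )


def is_two_gadget(source, clauses) -> bool:
    """Satisfied source: some t satisfies 2 clauses; otherwise at most 1."""
    for a, b, c in product((False, True), repeat=3):
        best = best_over_t(clauses, a, b, c)
        if source(a, b, c):
            if best != 2:
                return False
        elif best > 1:
            return False
    return True


# (a v b v ~c)  ->  (a v b v t), (~t v ~c)
one_negated_source = lambda a, b, c: a or b or (not c)
one_negated_gadget = [
    lambda a, b, c, t: a or b or t,
    lambda a, b, c, t: (not t) or (not c),
]

# (a v ~b v ~c)  ->  (a v t), (~t v ~b v ~c)
two_negated_source = lambda a, b, c: a or (not b) or (not c)
two_negated_gadget = [
    lambda a, b, c, t: a or t,
    lambda a, b, c, t: (not t) or (not b) or (not c),
]

if __name__ == "__main__":
    print(is_two_gadget(one_negated_source, one_negated_gadget))
    print(is_two_gadget(two_negated_source, two_negated_gadget))
```

Each replacement clause is non-mixed, and in both cases an unsatisfied source clause leaves the pair (t) and (not t), of which exactly one can hold.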

We improve these results by giving a PCP construction following the one 
in [8]; our PCP makes 3 queries, has perfect completeness and has soundness 
$1/2 + \varepsilon$. This new PCP construction is essentially the same as the one in [8]: we 
show that, with one simple modification, one of the two proof tables the verifier 
reads in their construction need not be folded (folding is a technical requirement 
in PCP constructions which will be elaborated later in the paper). This modified 
PCP construction enables us to prove a hardness of $11/12 + \varepsilon$ for Max 3-NM-Sat 
and a hardness of $15/16 + \varepsilon$ for Max E3-NM-Sat. Note that no polynomial 
time algorithm with better performance ratio than 7/8 is known for either of 
these problems, and progress in closing this gap remains an open problem. 

2 Hardness of Approximating Max E3-Set Splitting 

Gadgets: A Brief Discussion: The hardness results of this section are proven
by giving appropriate gadgets reducing constraint satisfaction problems already
known to be hard to approximate to Max E3-Set Splitting. We use the definitions
of gadgets following [2,12,16]: an $\alpha$-gadget reducing a boolean function
$f$ on variables $x_1, x_2, \ldots, x_k$ to a constraint family $\mathcal{F}$ is a finite
collection of (rational) weights $w_j$ and constraints $C_j$ from $\mathcal{F}$ over
$x_1, x_2, \ldots, x_k$ and auxiliary variables $y_1, y_2, \ldots, y_p$ such that for each
assignment $a = a_1, a_2, \ldots, a_k$ to the $x_i$'s that satisfies $f$, there is an
assignment $b$ to the $y_i$'s such that a total weight $\alpha$ of the constraints $C_j$
is satisfied by $(a, b)$, and if $a$ does not satisfy $f$, then for every assignment
$b$ to the $y_i$'s, the weight of the constraints $C_j$ satisfied by $(a, b)$
is at most $\alpha - 1$ (see [12,16] for further details). The quantity $\alpha$ measures
the quality of the reduction: a smaller value of $\alpha$ implies a better
approximation-preserving reduction.

Theorem 1. For any $\varepsilon > 0$, it is NP-hard to distinguish between instances of
Max E3-Set Splitting where all the sets can be split by some partition and
those where any partition splits at most a $19/20 + \varepsilon$ fraction of the sets.

The starting point for proving the above is the following powerful hardness
result, which follows from a PCP construction by Håstad [9] (namely the one he
used to show the hardness result for 4-set splitting).

Definition: (Max SNE4) An instance of Max SNE4 consists of Boolean variables
$x_1, x_2, \ldots, x_n$ and a collection $C$ of constraints of the form

$\mathrm{SNE}_4(x_{i_1}, x_{i_2}, x_{i_3}, x_{i_4}) = (x_{i_1} \neq x_{i_2}) \lor (x_{i_3} \neq x_{i_4})$

(no negations allowed) defined on certain subsets of these variables.

Theorem 2 ([9]). For any $\varepsilon > 0$, it is NP-hard to distinguish between instances
of Max SNE4 where all the constraints can be satisfied by some (boolean)
assignment to the variables and where no assignment satisfies more than a $3/4 + \varepsilon$
fraction of the constraints. In other words, even satisfiable instances of Max
SNE4 are NP-hard to approximate within a factor better than 3/4.

Proof of Theorem 1: The result will be proven by construction of an
appropriate gadget that reduces SNE4 constraints to 3-Set Splitting constraints.
Indeed we claim the following is a 5-gadget reducing $\mathrm{SNE}_4(a, b, c, d)$ to 3-set
splitting constraints: Introduce auxiliary variables $x, y, z$ (specific just to this
constraint) and replace $\mathrm{SNE}_4(a, b, c, d)$ with the five constraints SPLIT$(a, b, x)$,
SPLIT$(a, b, y)$, SPLIT$(c, d, y)$, SPLIT$(c, d, z)$, and SPLIT$(x, y, z)$ (the constraint
SPLIT$(p, q, r)$ represents the condition that not all of $p, q, r$ are equal).

Firstly, when $\mathrm{SNE}_4(a, b, c, d)$ is satisfied by some assignment to $a, b, c, d$, we
wish to verify that all five constraints in the gadget can be satisfied by an
appropriate assignment to the auxiliary variables. Indeed, let us assume that, say,
$a \neq b$ without loss of generality (since the gadget is symmetric in $\{a, b\}$
and $\{c, d\}$); then the assignment $y = z = \bar{c}$ and $x = c$ satisfies all the five
3-set splitting constraints. On the other hand, if $\mathrm{SNE}_4(a, b, c, d)$ is not
satisfied by some assignment to $a, b, c, d$, then we have $a = b$ and $c = d$. Now two
cases arise: (i) $(a = b) \neq (c = d)$, in which case one of the two constraints
SPLIT$(a, b, y)$ and SPLIT$(c, d, y)$ is violated irrespective of the assignment to $y$,
and (ii) $a = b = c = d$, in which case the only way to satisfy all the four
constraints SPLIT$(a, b, x)$, SPLIT$(a, b, y)$, SPLIT$(c, d, y)$, SPLIT$(c, d, z)$ is to set
$x = y = z = \bar{d}$, and in this case SPLIT$(x, y, z)$ is violated. Thus at most four of
the five constraints can be satisfied for any assignment to $x, y, z$ if $\mathrm{SNE}_4(a, b, c, d)$
is not satisfied. In other words, the gadget above is a 5-gadget.
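The case analysis above can also be checked exhaustively. The following is a minimal sketch, assuming 0/1 values for the variables; the helper names (`split`, `sne4`, `best_gadget_value`, `is_five_gadget`) are illustrative and do not come from the paper.

```python
from itertools import product

def split(p, q, r):
    """SPLIT(p, q, r): true unless all three values are equal."""
    return not (p == q == r)

def sne4(a, b, c, d):
    """SNE4(a, b, c, d) = (a != b) or (c != d)."""
    return a != b or c != d

def best_gadget_value(a, b, c, d):
    """Maximum number of the five SPLIT constraints satisfiable over x, y, z."""
    return max(
        sum([split(a, b, x), split(a, b, y), split(c, d, y),
             split(c, d, z), split(x, y, z)])
        for x, y, z in product([0, 1], repeat=3)
    )

def is_five_gadget():
    """Check: 5 constraints when SNE4 holds, at most 4 otherwise."""
    for a, b, c, d in product([0, 1], repeat=4):
        best = best_gadget_value(a, b, c, d)
        if sne4(a, b, c, d) and best != 5:
            return False
        if not sne4(a, b, c, d) and best > 4:
            return False
    return True

print(is_five_gadget())
```

The check enumerates all 16 assignments to $a, b, c, d$ and all 8 assignments to the auxiliary variables, so it covers exactly the cases argued in the proof.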

Starting from a satisfiable instance of SNE4 constraints that is hard to
approximate within $3/4 + \varepsilon$ (as in Theorem 2), this gives a hardness result of
approximating satisfiable instances of Max E3-Set Splitting to within
$19/20 + \varepsilon/5$, and since $\varepsilon > 0$ is arbitrary, this gives the desired result
of Theorem 1. □
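The arithmetic behind this step can be reproduced with exact fractions. This is a small sketch under the assumption that every source constraint is replaced by its own copy of a unit-weight $\alpha$-gadget; the function name `gadget_soundness` is hypothetical.

```python
from fractions import Fraction

def gadget_soundness(source_soundness, alpha):
    """Largest satisfiable fraction of target constraints in the 'no' case.

    Each source constraint becomes alpha unit-weight target constraints;
    a satisfied source constraint yields alpha satisfied targets and an
    unsatisfied one yields at most alpha - 1.
    """
    s = Fraction(source_soundness)
    return (s * alpha + (1 - s) * (alpha - 1)) / alpha

# Max SNE4 (soundness 3/4) through the 5-gadget.
print(gadget_soundness(Fraction(3, 4), 5))  # 19/20
```

With soundness $3/4 + \varepsilon$ in place of $3/4$, the same formula gives $19/20 + \varepsilon/5$.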



3 Hardness of Max E$k$-Set Splitting for $k \geq 4$

Håstad [9] proves the tight result that even satisfiable instances of Max E4-Set
Splitting are hard to approximate within any factor better than 7/8. This
result does not imply a tight hardness result for Max E$k$-Set Splitting for
$k \geq 5$ by just using a gadget. One can, however, easily modify Håstad's PCP
construction to work also for Max E$k$-Set Splitting for $k \geq 5$, and this gives
the following result:

Theorem 3 ([9]). For $k \geq 4$, for any $\varepsilon > 0$, it is NP-hard to distinguish
between instances of Max E$k$-Set Splitting where all the sets can be split
by some partition and those where any partition splits at most a
$1 - 2^{-k+1} + \varepsilon$ fraction of the sets.
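The threshold $1 - 2^{-k+1}$ is the fraction of two-colorings that split a fixed $k$-set, which is what a uniformly random partition achieves in expectation. A brief enumeration sketch (the function name `split_fraction` is illustrative):

```python
from fractions import Fraction
from itertools import product

def split_fraction(k):
    """Fraction of the 2^k two-colorings of a k-set that split it."""
    colorings = list(product([0, 1], repeat=k))
    split = sum(1 for coloring in colorings if len(set(coloring)) == 2)
    return Fraction(split, len(colorings))

for k in range(3, 7):
    # Only the two monochromatic colorings fail to split the set.
    assert split_fraction(k) == 1 - Fraction(2) ** (1 - k)

print(split_fraction(4))  # 7/8
```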

4 Hardness of Approximating Max 3-NM-Sat

We will prove the following theorems in this section. 

Theorem 4. For any $\varepsilon > 0$, it is NP-hard to distinguish between satisfiable
instances of Max 3-NM-Sat and those where at most a fraction $11/12 + \varepsilon$ of
the clauses can be satisfied.

Theorem 5. For any $\varepsilon > 0$, it is NP-hard to distinguish between satisfiable
instances of Max E3-NM-Sat and those where at most a fraction $15/16 + \varepsilon$ of
the clauses can be satisfied.

Sketch of Idea: The instances of 3Sat which are proved hard to approximate
within a factor of $7/8 + \varepsilon$ have the property that (at least) 3/4 of the clauses
are "mixed" (i.e., have both positive and negative literals). In proving a hardness
result for, say, Max 3-NM-Sat, one converts these clauses into non-mixed
clauses using a gadget of some cost (and this method gives a factor $13/14 + \varepsilon$
hardness for Max 3-NM-Sat). In order to obtain an improved hardness bound,
our approach will be to get a similar hardness result for 3Sat when the fraction
of mixed clauses is smaller, and then use the same gadget approach to get
hardness for Max 3-NM-Sat. To this end, we will first prove:

Theorem 6. For any $\varepsilon > 0$, given a Max E3-Sat instance in which at most
half the clauses are mixed, it is NP-hard to distinguish between instances which
are satisfiable and those where every assignment satisfies at most a fraction
$7/8 + \varepsilon$ of the clauses.

Proof: The proof follows from the construction of a PCP system for NP that
makes 3 (adaptive) queries, has perfect completeness, and a soundness of $1/2 + \varepsilon$
for any $\varepsilon > 0$. The hardness for Max E3-Sat as claimed then follows by suitable
gadgets reducing the constraints checked by the PCP verifier to 3Sat constraints.
The PCP will be a simple modification of the one in [8,7]; we will rely heavily
on the treatment and terminology of [7]. We provide below a high-level
description of the PCP construction; while by no means complete, this should
give some sense of the ideas used in the construction.

Interlude: PCP constructions follow the paradigm of proof composition. In its
most modern and convenient-to-use form, one starts with an outer proof system
which is a 2-Prover 1-Round proof system (2P1R) construction for NP due to
Raz [15]. Raz's construction works as follows. Given a 3Sat instance, the verifier
picks $u$ variables at random, and for each variable picks a clause in which it
occurs at random. The verifier then asks the prover $P_1$ for the truth assignment
to the $u$ variables and the prover $P_2$ for the truth assignment to the $3u$ variables
in all the clauses it picked. (We ignore issues like two picked clauses sharing a
variable as these have $o(1)$ probability of occurring.) The verifier then accepts if
the assignment given by $P_2$ satisfies all the clauses it picked and also is consistent
with the assignment returned by $P_1$. This requirement can be captured as the
answers $a$ and $b$ of $P_1$ and $P_2$ satisfying a "projection" requirement $\pi(b) = a$.
Raz's parallel-repetition theorem proves that the soundness of this 2P1R goes
down as $c^u$ for some absolute constant $c < 1$.

In the final PCP system, the proof is expected to be the encodings of all
possible answers of the two provers of the outer 2P1R proof system using some
suitable error-correcting code. For efficient constructions the code used is the
long code of [2]. The long code of a string of $k$ bits is simply the value of all the
$2^{2^k}$ $k$-ary boolean functions on that string (for example, the long code of a
$k$-bit string $a$ is a string $A$ which has one coordinate for each $k$-ary boolean
function $f$, and the entry of $A$ in coordinate $f$, denoted $A(f)$, satisfies $A(f) = f(a)$).
The construction of a PCP now reduces to the construction of a good inner
verifier that, given a pair of strings $A, B$ which are purportedly long codes and a
projection function $\pi$, checks if these strings are the long codes of two consistent
strings (as per the projection). Referring the reader to [7] for details, we delve
into the specification of our inner verifier. [End Interlude]
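To make the definition of the long code concrete, here is a minimal sketch for very small $k$. It uses 0/1 values rather than the $\{1, -1\}$ convention of the paper, and it identifies each boolean function with its truth table; both choices, and the name `long_code`, are assumptions made for illustration only.

```python
from itertools import product

def long_code(a):
    """Long code of the bit string a: the value f(a) for every k-ary boolean f.

    Each function f is identified with its truth table, a tuple of 2^k bits
    indexed by the k-bit inputs in lexicographic order.
    """
    k = len(a)
    inputs = list(product([0, 1], repeat=k))
    position = inputs.index(tuple(a))
    return {table: table[position]
            for table in product([0, 1], repeat=2 ** k)}

code = long_code((1, 0))
print(len(code))  # one coordinate per function: 2^(2^2) = 16
```

The doubly exponential length is why the outer proof system must have answers of constant size before the long code is applied.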

The inner verifier is given as input an integer $u$ and a projection function
$\pi : [7]^u \to [2]^u$ and has oracle access to tables
$A : \mathcal{F}_{[2]^u} \to \{1, -1\}$, which is folded
(i.e., $A(-f) = -A(f)$ for all functions $f \in \mathcal{F}_{[2]^u}$), and
$B : \mathcal{F}_{[7]^u} \to \{1, -1\}$ (which is not required to be folded), and aims to
check that $A$ (resp. $B$) is the long code of $a$ (resp. $b$) which satisfy $\pi(b) = a$.
The formal specification of our inner verifier $\mathrm{IV3}_\delta^{A,B}(u, \pi)$ is given in
Figure 1 (the constant $c$ used in its specification is an absolute (small) constant
which can be figured out from our proofs).



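Inner Verifier $\text{B-MBC}_p^{A,B}(u, \pi)$

Choose uniformly at random $f \in \mathcal{F}_{[2]^u}$, $g \in \mathcal{F}_{[7]^u}$
Choose at random $h \in \mathcal{F}_{[7]^u}$ such that $\forall b \in [7]^u$, $\Pr[h(b) = 1] = p$
if $A(f) = 1$ then accept iff $B(g) \neq B(-g(f \circ \pi \land h))$
if $A(f) = -1$ then accept iff $B(g) \neq B(-g(-f \circ \pi \land h))$

Inner Verifier $\mathrm{IV3}_\delta^{A,B}(u, \pi)$

Set $t = \lceil 1/\delta \rceil$, $\varepsilon_1 = \delta^{[\ldots]}$ and $\varepsilon_i = [\ldots]$
Choose $p \in \{\varepsilon_1, \ldots, \varepsilon_t\}$ uniformly at random
Run $\text{B-MBC}_p^{A,B}(u, \pi)$

Fig. 1. The inner verifier $\text{B-MBC}_p$ and our final inner verifier $\mathrm{IV3}_\delta$.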



The only difference of this inner verifier from the one in [8] is that the table
$B$ is not assumed to be folded (this will be critical to our application), and the
conditions checked are "inequalities" as opposed to "equalities" (i.e., of the form
$x \neq y$ instead of $x = y$). Note that this change is clearly necessary, as otherwise
one could simply set $B(g) = 1$ for all $g$, and this would satisfy all checks (this is a
valid table as $B$ is not required to be folded).

Footnote: The notation $\mathcal{F}_D$ stands for the space of all functions
$f : D \to \{1, -1\}$ and $[n]$ stands for the set $\{1, 2, \ldots, n\}$. We also use
$\{1, -1\}$ for representing boolean values, with 1 standing for False and $-1$ for True.

It is clear that this inner verifier has perfect completeness, since when $A$ is
the long code of $a$, $B$ is the long code of $b$ and $\pi(b) = a$, the inner verifier
accepts with probability 1. Assume now that we can prove that for some small
$\varepsilon > 0$ (which can be made as small as we seek), the combined PCP verifier
constructed by composing the standard outer verifier due to Raz with the above
inner verifier has soundness $1/2 + \varepsilon$. We now claim that this will imply the result
claimed about Max E3-Sat with at most half the number of clauses being mixed. It
is easy to see that, for each random choice of the inner verifier, the boolean
function checked by the inner verifier is of the form

$(a \lor b \lor c) \land (a \lor \bar{b} \lor \bar{c}) \land (\bar{a} \lor b \lor c') \land (\bar{a} \lor \bar{b} \lor \bar{c}')$

by appropriate identifications of $B(g)$ with $b$, $A(f)$ with $a$ or $\bar{a}$ depending upon
whether the folded table contains an entry for the function $f$ or for its
complement $-f$, and suitable identifications of $B(-g(f \circ \pi \land h))$ and
$B(-g(-f \circ \pi \land h))$ with $c$, $c'$. This actually gives a 4-gadget reducing the
PCP's acceptance predicate to 3Sat constraints.
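The 4-clause form of the acceptance predicate, and the count of mixed clauses it contains, can be confirmed by enumeration. The sketch below assumes the clause set $(a \lor b \lor c)$, $(a \lor \bar{b} \lor \bar{c})$, $(\bar{a} \lor b \lor c')$, $(\bar{a} \lor \bar{b} \lor \bar{c}')$, with `True` playing the role of $-1$; the data layout and function names are illustrative.

```python
from itertools import product

# Each clause is a list of (variable index, is_positive) literals over
# the variables (a, b, c, c') = indices (0, 1, 2, 3).
CLAUSES = [
    [(0, True), (1, True), (2, True)],
    [(0, True), (1, False), (2, False)],
    [(0, False), (1, True), (3, True)],
    [(0, False), (1, False), (3, False)],
]

def clause_true(clause, values):
    return any(values[i] == positive for i, positive in clause)

def verifier_accepts(values):
    """If a is False check b != c; if a is True check b != c'."""
    a, b, c, c2 = values
    return (b != c2) if a else (b != c)

def check_gadget():
    """All 4 clauses hold on acceptance, at most 3 hold on rejection."""
    for values in product([False, True], repeat=4):
        satisfied = sum(clause_true(cl, values) for cl in CLAUSES)
        if verifier_accepts(values) and satisfied != 4:
            return False
        if not verifier_accepts(values) and satisfied > 3:
            return False
    return True

def mixed_count():
    """Number of clauses containing both a positive and a negative literal."""
    return sum(1 for cl in CLAUSES if len({pos for _, pos in cl}) == 2)

print(check_gadget(), mixed_count())
```

Swapping the polarity of $a$ (the folded-table case) exchanges which two clauses are mixed but leaves the count at two, because $b$, $c$ and $c'$ always appear as entries of the unfolded table $B$.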

Hence the acceptance condition of the PCP can be viewed as a 3Sat instance
in which at most half the clauses are mixed (note that only two of the four clauses
in the gadget above are mixed). Since the soundness of the PCP is $1/2 + \varepsilon$,
together with the 4-gadget (which turns soundness $1/2$ into $1 - (1/2)/4 = 7/8$),
this gives a hardness of approximating Max E3-Sat with at most half the number
of clauses being mixed, as desired. We therefore need to bound the soundness of
the inner verifier by $1/2 + \varepsilon$. The analysis of the soundness of the inner verifier
follows the same sequence of lemmas as the proof for the original inner verifier
$\mathrm{IV3}_\delta$ in [9,7]. The only change is that we must now rework the proofs without
assuming that $B$ is folded. The full details of the soundness analysis can be
found in the full version of this paper [6].
□ (Theorem 6)

Proof of Theorem 4: There is a simple reduction from E3Sat to non-mixed
3Sat obtained by replacing a clause $(a \lor b \lor \bar{c})$ with two non-mixed clauses
$(a \lor b \lor t)$ and $(\bar{t} \lor \bar{c})$, where $t$ is a new variable used only in these two
clauses, and by similarly replacing a clause $(a \lor \bar{b} \lor \bar{c})$ with two non-mixed
clauses $(a \lor t)$ and $(\bar{t} \lor \bar{b} \lor \bar{c})$. Starting from a hard instance of E3Sat
(3Sat with all clauses having exactly three literals) as in Theorem 6, satisfiable
instances of E3Sat get mapped to satisfiable instances of Max 3-NM-Sat, while
instances where at most a fraction $7/8 + \varepsilon$ of the clauses are satisfiable get
mapped to instances of Max 3-NM-Sat where at most a fraction $11/12 + 2\varepsilon/3$ of
the clauses are satisfiable. Since $\varepsilon > 0$ is arbitrary, the result of Theorem 4
follows. □ (Theorem 4)
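Both ingredients of this proof, the correctness of the two-clause replacements and the resulting fraction, are easy to verify mechanically. The sketch below assumes the worst case in which exactly half of the clauses are mixed and every unsatisfied original clause costs exactly one clause after the replacement; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def best(clauses_for_t):
    """Max number of gadget clauses satisfiable over the new variable t."""
    return max(sum(clauses_for_t(t)) for t in (False, True))

def gadgets_ok():
    for a, b, c in product([False, True], repeat=3):
        # (a or b or not c) -> (a or b or t), (not t or not c)
        g1 = best(lambda t: [a or b or t, (not t) or (not c)])
        # (a or not b or not c) -> (a or t), (not t or not b or not c)
        g2 = best(lambda t: [a or t, (not t) or (not b) or (not c)])
        if g1 != (2 if (a or b or not c) else 1):
            return False
        if g2 != (2 if (a or not b or not c) else 1):
            return False
    return True

def nm_soundness(sat_soundness, mixed_fraction, gadget_size):
    """Soundness after replacing each mixed clause by gadget_size clauses."""
    total = (1 - mixed_fraction) + gadget_size * mixed_fraction
    return 1 - (1 - sat_soundness) / total

print(gadgets_ok(), nm_soundness(Fraction(7, 8), Fraction(1, 2), 2))
```

Calling `nm_soundness` with a mixed fraction of 3/4 instead of 1/2 reproduces the weaker 13/14 bound mentioned in the sketch of the idea.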

Footnote: The exact definition of soundness of an inner verifier turns out to be a
tricky issue; we once again refer the reader to [8,7] for the definitions.

Footnote: The results in this paper on set splitting are stronger than the ones
claimed in [6], but the results on non-mixed Sat are identical in both versions.







Venkatesan Guruswami 



Proof of Theorem 5: The proof is similar to the above, except now a clause
$(a \lor b \lor \bar{c})$ is replaced by the three non-mixed clauses $(a \lor b \lor t_1)$,
$(a \lor b \lor t_2)$ and $(\bar{c} \lor \bar{t}_1 \lor \bar{t}_2)$, and a clause
$(a \lor \bar{b} \lor \bar{c})$ is replaced by the three non-mixed clauses
$(\bar{b} \lor \bar{c} \lor \bar{t}_1)$, $(\bar{b} \lor \bar{c} \lor \bar{t}_2)$ and $(a \lor t_1 \lor t_2)$,
where $t_1, t_2$ are new variables used only in these three clauses. □ (Theorem 5)
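As with Theorem 4, the three-clause replacements can be checked by brute force. The sketch assumes the replacements $(a \lor b \lor t_1)$, $(a \lor b \lor t_2)$, $(\bar{c} \lor \bar{t}_1 \lor \bar{t}_2)$ and $(\bar{b} \lor \bar{c} \lor \bar{t}_1)$, $(\bar{b} \lor \bar{c} \lor \bar{t}_2)$, $(a \lor t_1 \lor t_2)$; the function names are illustrative.

```python
from itertools import product

def best_over_t(clauses):
    """Max number of gadget clauses satisfiable over the new variables t1, t2."""
    return max(sum(clauses(t1, t2))
               for t1, t2 in product([False, True], repeat=2))

def e3_gadgets_ok():
    for a, b, c in product([False, True], repeat=3):
        g1 = best_over_t(lambda t1, t2: [
            a or b or t1,
            a or b or t2,
            (not c) or (not t1) or (not t2),
        ])
        g2 = best_over_t(lambda t1, t2: [
            (not b) or (not c) or (not t1),
            (not b) or (not c) or (not t2),
            a or t1 or t2,
        ])
        # All three clauses when the original clause holds, two otherwise.
        if g1 != (3 if (a or b or not c) else 2):
            return False
        if g2 != (3 if (a or not b or not c) else 2):
            return False
    return True

print(e3_gadgets_ok())
```

With half the clauses mixed and three clauses per mixed clause, the total clause count doubles, so an unsatisfied fraction of 1/8 becomes 1/16, which matches the 15/16 bound of Theorem 5.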

5 Concluding Remarks 

The exact approximability of Max E3-Set Splitting remains an intriguing 
open question, though by now the gap between the positive and negative results 
is quite small (there is a 0.908 approximation algorithm, and it is NP-hard to 
get a better than 0.95 approximation). A similar situation exists with Max Cut,
where a 0.878-approximation algorithm exists, while approximating within a
factor of 0.942 is NP-hard.

We established the tight result that approximating Max E3-Sat where at 
most half the number of clauses are mixed to within any factor better than 7/8 
is NP-hard. We then used this result to deduce hardness of approximating Max 
3-NM-Sat and Max E3-NM-Sat within factors better than 11/12 and 15/16 
respectively. It is not clear how to algorithmically exploit the non-mixed nature 
of all clauses and devise an algorithm with performance ratio better than 7/8 for 
either of these problems; in fact it is quite possible that these problems are
hard to approximate within $7/8 + \varepsilon$, though an approach which could potentially
establish such a result has eluded us.

We close by discussing some interesting questions related to Max E4-Set 
Splitting. If the goal is only to split as many sets as possible, then Hastad’s 
result gives that the best one can do is a factor of 7/8. One can ask, more
specifically, to maximize the number of 4-sets which have a 1-3
split under the partition, or similarly to maximize the number of sets that have
a 2-2 split under the partition. It turns out that the former problem can be
cast as a system of linear equations over GF(2), each of the form
$x_{i_1} \oplus x_{i_2} \oplus x_{i_3} \oplus x_{i_4} = 1$, and by yet another powerful result in [9],
this problem is hard to approximate within any factor better than 1/2 (of course
picking a random partition will satisfy half of these linear constraints, so this
result is also tight).
The 2-2 splitting problem, however, is another instance where a tight result 
is not known. It is easy to prove, using a simple 3-gadget from Max SNE4, 
that this problem is hard to approximate within 11/12 + £ for any £ > 0, even 
on satisfiable instances. The current best approximation algorithm seems to be 
reducing the problem to Max E3-Set Splitting and picking a random solution, 
and returning the better of the two solutions - this achieves an approximation 
ratio of about 0.4035 (which is better than the factor 3/8 achieved by a random 
solution). The current gap between the hardness and algorithms is thus quite 
large (we believe a semidefinite programming based algorithm should be able 
to achieve much better performance for this problem). A significantly better 
hardness result for the 2-2 splitting problem would be interesting, as it might 
give insights on how to improve the inapproximability bounds for Max E3-Set 
Splitting and possibly even Max Cut - in fact a hardness factor better than
0.794 for the 2-2 splitting problem would immediately improve the current best 
hardness result for Max Cut. 
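The two facts about 4-sets used in this discussion, that a 1-3 split is exactly an odd-parity constraint and that a random partition yields a 2-2 split with probability 3/8, can be confirmed by enumerating the 16 colorings of a single 4-set. The names below are illustrative.

```python
from fractions import Fraction
from itertools import product

def split_type(bits):
    """Sizes of the two sides of the partition restricted to one 4-set."""
    ones = sum(bits)
    return tuple(sorted((ones, 4 - ones)))

assignments = list(product([0, 1], repeat=4))

# A 1-3 split is the same as x1 xor x2 xor x3 xor x4 = 1.
parity_matches = all(
    (split_type(x) == (1, 3)) == ((x[0] ^ x[1] ^ x[2] ^ x[3]) == 1)
    for x in assignments
)

one_three = Fraction(sum(split_type(x) == (1, 3) for x in assignments), 16)
two_two = Fraction(sum(split_type(x) == (2, 2) for x in assignments), 16)

print(parity_matches, one_three, two_two)
```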

Abstract. We study a network loading problem with applications in lo-
cal access network design. Given a network, the problem is to route flow 
from several sources to a sink and to install capacity on the edges to sup- 
port flows at minimum cost. Capacity can be purchased only in multiples 
of a fixed quantity. All the flow from a source must be routed in a sin- 
gle path to the sink. This NP-hard problem generalizes the Steiner tree 
problem and also more effectively models the applications traditionally 
formulated as capacitated tree problems. We present an approximation 
algorithm with performance ratio $(\rho_{ST} + 2)$, where $\rho_{ST}$ is the performance
ratio of any approximation algorithm for minimum Steiner tree. When
all sources have the same demand value, the ratio improves to $(\rho_{ST} + 1)$
and in particular, to 2 when all nodes in the graph are sources.



1 Introduction 

We consider a single-sink multiple-source routing and capacity installation prob- 
lem where capacity can be purchased in multiples of a fixed quantity. In telecom- 
munication network design this corresponds to installing transmission facilities 
such as fiber-optic cables on the edges of a network, and in transportation net- 
works this applies to assigning vehicles of fixed capacity to routes. Topological 
design of communication networks is usually carried out in stages due to the
complexity of the problem. One of the fundamental stages is the design of a local
access network which links the users to a switching center. The problem we study 
models this stage of the planning process. 

Problem Statement. We are given an underlying undirected graph
$G = (V, E)$, $|V| = n$. A subset $S$ of nodes is specified as sources of traffic and
a single sink $t$ is specified. Each source node $s_i \in S$ has a positive
integer-valued demand $dem_i$. All the traffic of each source must be routed to $t$
via a single path, that is, flow cannot be bifurcated. The edges of $G$ have lengths
$\ell : E \to \mathbb{R}^+$. Without loss of generality, we may assume that for every pair
of nodes $v, w$, we can use the shortest-path distance $dist(v, w)$ as the length of
the edge between $v$ and $w$. Therefore, we take the metric completion of the given
graph and assume all edges from the complete graph are available. Capacity must
be installed on the edges of the network by purchasing one or more copies of a
facility, which we refer to as the "cable" based on the telecommunication
application. The cable has per unit length cost $c$ and capacity $u$. Without loss
of generality we can assume $c = 1$.
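The metric completion mentioned above can be computed with any all-pairs shortest-path routine. The following is a minimal sketch using the Floyd-Warshall recurrence, assuming nodes are numbered $0, \ldots, n-1$ and edge lengths are given as a dictionary; the function name and input format are illustrative and not part of the paper.

```python
def metric_completion(n, edges):
    """All-pairs shortest-path distances for an undirected graph.

    edges maps (v, w) pairs to non-negative lengths; nodes are 0..n-1.
    The returned matrix dist[v][w] is the length of edge (v, w) in the
    metric completion.
    """
    INF = float("inf")
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for (v, w), length in edges.items():
        dist[v][w] = min(dist[v][w], length)
        dist[w][v] = min(dist[w][v], length)
    # Floyd-Warshall: allow intermediate nodes 0..k one at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

dist = metric_completion(3, {(0, 1): 1, (1, 2): 1, (0, 2): 5})
print(dist[0][2])  # the direct edge of length 5 is replaced by the path 0-1-2
```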

The problem is to specify, for each source $s_i$, a path to $t$ to route demand
$dem_i$ such that cables installed on each edge of the network provide sufficient
capacity for the flow on the edge, and total cost of cables installed is minimized. 
Notice that we allow paths from different sources to share the capacity on the 
installed cables, the only restriction being that the capacity installed on an edge 
is at least as much as the total demand routed through this edge. 
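To illustrate the objective, the sketch below evaluates the cost of a given routing with $c = 1$: it accumulates the demand on every edge and buys the smallest number of cables that covers it. The input format (paths as node lists, lengths keyed by unordered node pairs) and the name `routing_cost` are assumptions made for this example.

```python
def routing_cost(paths, demand, length, u):
    """Total cable cost of a routing, with unit cost per cable per unit length.

    paths[s]  : list of nodes from source s to the sink (a single path)
    demand[s] : positive integer demand of source s
    length    : dict mapping frozenset({v, w}) to the edge length
    u         : cable capacity
    """
    flow = {}
    for s, path in paths.items():
        for v, w in zip(path, path[1:]):
            edge = frozenset((v, w))
            flow[edge] = flow.get(edge, 0) + demand[s]
    # ceil(f / u) copies of the cable are needed on an edge carrying flow f.
    return sum(-(-f // u) * length[edge] for edge, f in flow.items())

paths = {"s1": ["s1", "h", "t"], "s2": ["s2", "h", "t"]}
demand = {"s1": 3, "s2": 4}
length = {
    frozenset(("s1", "h")): 1,
    frozenset(("s2", "h")): 1,
    frozenset(("h", "t")): 10,
}
print(routing_cost(paths, demand, length, u=5))  # 1 + 1 + 2 * 10 = 22
```

In the example the two sources share edge $(h, t)$, whose total flow of 7 requires two cables of capacity 5.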

The problem is NP-hard since the problem with cable capacity large enough 
to hold all of the demand is equivalent to a Steiner tree problem with the sources 
and the sink as the terminal nodes. 

Previous Work. This problem has been studied in the literature as the 
network loading problem, together with its variations such as the multicommod- 
ity and multiple facility cases. For a survey on exact solution methods see the 
chapter on multicommodity capacitated network design by Gendron, Crainic 
and Frangioni in [SS99]. In spite of the recent computational progress, the size 
of the instances that can be solved to optimality in reasonable time is still far 
from the size of real-life instances. 

In this paper we focus on obtaining approximation algorithms. A constant
factor approximation for this problem was obtained by Salman et al. in [SCR+97]
by applying the method of Mansour and Peleg [MP94] to the case of a single
sink and a single cable type. The main algorithm of Mansour and Peleg applies to
the multiple-source multiple-sink single cable problem with approximation ratio
$O(\log n)$ in an $n$-node graph. By using a Light Approximate Shortest-Path Tree
(LAST) [KRY93] instead of a more general-purpose spanner in this algorithm,
Salman et al. obtained a 7-approximation algorithm for the single-sink version.
When all the nodes in the input network except the sink node are source nodes,
the approximation ratio in [SCR+97] reduces to $(2\sqrt{2} + 2)$. Another constant
factor approximation algorithm for this problem also follows from the work of
Andrews and Zhang [AZ98], who gave an $O(k^2)$-approximation algorithm for the
single sink problem with $k$ cable types, but the resulting constant factor is rather
high.

Results. In this paper, we improve the approximation ratio to $(\rho_{ST} + 2)$ by
routing through a network that is built on an approximate Steiner tree with
performance ratio $\rho_{ST}$. The idea is to utilize the Steiner tree when demand is
low compared to the cable capacity and when demand accumulates to a value
close to the cable capacity, it is sent directly to the sink. For the special case 




Approximation Algorithms for a Capacitated Network Design Problem 






when demand of each source is uniform, the approximation ratio improves to
$(\rho_{ST} + 1)$. When all the nodes in the input network except the sink node are
source nodes, the approximation ratio reduces to 3 with non-uniform demands, 
and to 2, for uniform demands. 

Our study was also motivated by obtaining better approximation algorithms 
for the capacitated MST problem [Pap78,AG88,KB83,CL83,S83]: Given an undi- 
rected edge-weighted graph with a root node and a positive integer u, the prob- 
lem is to find the minimum weight tree such that every subtree hanging off the 
root node has at most u nodes in it. This problem has been cited [KR98,AG88] 
to model the local access network design problem when every non-sink node is 
required to route a single unit of demand to the sink via cables each of capac- 
ity at most u. The requirement that every demand has to send its unit flow 
via a single path is modeled as requiring a tree as the solution. However, if 
routing these demands at nodes is not a concern, we can still enforce the non- 
bifurcating requirement for the demands without requiring that the solution be 
a tree. This reformulation leads exactly to our single cable problem in the uni- 
form case with all nodes being sources. Our 2-approximation algorithm for this 
problem is then a better solution than the best-known 3-approximation [AG88] 
for the corresponding capacitated MST formulation. In the nonuniform demand 
case, our $(\rho_{ST} + 2)$-approximation is better than the best known 4-approximation
presented in [AG88] in addition to handling the Steiner version that does not 
require all non-sink nodes be source nodes. 

In the next two sections, we present the algorithms for the case of uniform 
and non-uniform demands, respectively. We close with an extension of the local 
access design problem. 

2 Uniform Demand 

We first present an approximation algorithm for the case when every source has 
the same demand. Without loss of generality, we assume demand equals one for 
each source. 

We can outline the algorithm as follows. First we construct an approximate
Steiner tree with terminal set $S \cup \{t\}$ and cost $dist(e)$ on each edge $e$ in
polynomial time. Let $T$ be the approximate Steiner tree with worst-case ratio
$\rho_{ST}$ (see footnote 1). Let the tree $T$ be rooted at the sink node $t$. Next, we
identify subtrees of $T$ such that total demand in a subtree equals the cable
capacity $u$. We then route the total demand within a subtree directly to the sink
from the node of the subtree closest to the sink. The subtrees collected by the
algorithm may contain common nodes but have disjoint source sets.

For a formal statement of the algorithm, we need the following definitions.
Let the level of a node be the number of tree edges on its path to the root. The
parent of a node $v$ is the node adjacent to it on the path from $v$ to $t$. For each
node $v$, let $T_v$ denote the subtree of $T$ rooted at $v$ and $D(T_v)$ denote the total
unprocessed demand in $T_v$. Let $R$ be the set of unprocessed source nodes. Then,
$D(T_v) = \sum_{s_i \in R \cap T_v} dem_i = |R \cap T_v|$. The Algorithm Uniform below
outputs a routing for the demand from each source to the sink, and the number of
cables that are installed to support the routing.

Footnote 1: The MST is a 2-approximate solution. Better approximation ratios are
known, e.g., a 1.55-approximation was given recently in [RZ00].

Algorithm Uniform:

Initialize: $R = S$.

Main step:
    Pick a node $v$ such that $D(T_v) \geq u$ and the level of $v$ is maximum.
    If $v = t$ or $D(T_t) < u$, then go to the final step.
    Pick a node, say $w$, in $R \cap T_v$ such that $dist(w, t)$ is minimum, as a "hub" node.
    Let $C = \{w\}$.
    Collect source nodes in $C$ (details given below).
    Add edge $(w, t)$ to the network and install one copy of the cable on $(w, t)$.
    Route demand of each source in $C$ to the hub node $w$ via the unique paths in $T$.
    Route demand of $C$ at the hub directly to the sink on $(w, t)$.
    Remove $C$ from $R$, and set $C = \emptyset$.
    If $R$ is not empty, repeat the main step.
    If $R$ is empty, go to the final step.

Final step:
    If $R \neq \emptyset$, then route all the demand in $R$ to $t$ via their paths in $T$.
    For all edges $e$ of $T$,
        Cancel the maximal possible amount of flow of equal value in opposite
        directions such that total flow will not exceed $u$.
    Install one copy of cable on the edges of $T$ which have positive flow.

Collect Source Nodes:
    Add $v$ to $C$, if $v \in R$.
    Let $v_1, \ldots, v_k$ be the children of $v$.
    If $w \neq v$, then
        Let $v_p$ be the child of $v$ such that the hub node $w$ is in $T_{v_p}$.
        Add $T_{v_p} \cap R$ to $C$.
    While $|C| < u$,
        Pick an unprocessed child of $v$, say $v_i$.
        If $D(T_{v_i}) + |C| \leq u$, then
            Add $T_{v_i} \cap R$ to $C$.
        Else, ($T_{v_i}$ is collected partially)
            Scan $T_{v_i}$ depth-first.
            Add sources in $R \cap T_{v_i}$ to $C$ until $|C| = u$.
    Return $C$.
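The subtree selection rule at the start of the main step can be sketched as follows. This covers only the choice of the node $v$ (not the collection of sources or the flow cancellation), assumes unit demands, and represents the rooted tree by a parent map; the function name and data layout are hypothetical.

```python
def pick_deepest_heavy_node(parent, root, unprocessed, u):
    """Return a maximum-level node v with D(T_v) >= u, or None if none exists.

    parent      : maps each non-root tree node to its parent
    root        : the sink t
    unprocessed : the set R of sources not yet collected (unit demands)
    u           : cable capacity
    """
    nodes = set(parent) | {root}
    level = {}

    def get_level(v):
        # Level = number of tree edges on the path from v to the root.
        if v not in level:
            level[v] = 0 if v == root else get_level(parent[v]) + 1
        return level[v]

    # remaining[v] = D(T_v): each unprocessed source counts once for
    # every node on its path to the root.
    remaining = {v: 0 for v in nodes}
    for s in unprocessed:
        v = s
        while True:
            remaining[v] += 1
            if v == root:
                break
            v = parent[v]

    heavy = [v for v in nodes if remaining[v] >= u]
    return max(heavy, key=get_level) if heavy else None

parent = {"a": "t", "b": "a", "c": "a", "d": "t"}
print(pick_deepest_heavy_node(parent, "t", {"b", "c", "d"}, u=2))  # a
```

When the function returns the root or `None`, the algorithm proceeds to its final step; ties between nodes at the same level may be broken arbitrarily.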



Lemma 1. The algorithm routes demand such that flow on any edge of the tree
$T$ is at most the cable capacity $u$.

Proof. Consider an edge $e$ of $T$. Let $v$ be the incident node on $e$ with higher level
(see Figure 1). Flow on $e$ is determined by the total flow coming out of $T_v$ and
going into $T_v$. Our proof is based on these two claims:

Claim 1: Total flow going out of $T_v$ is at most $u - 1$.

Fig. 1. Subtree $T_v$ and its children.



Claim 2: Total flow coming into $T_v$ is at most $u - 1$.

To prove Claims 1 and 2, we consider two cases based on how the sources in $T_v$
are assigned to hub nodes by the algorithm. A partially assigned subtree has at
least one of its source nodes collected in a set $C$ and has at least one source node
not in $C$.





Fig. 2. Examples of partially and completely assigned subtrees. 



Suppose $T_v$ is partially assigned (see Figure 2). The first time flow goes
out of $T_v$, a subtree $T_{v'}$ with $v'$ at a smaller level than $v$ is being processed by
the algorithm. Due to the subtree selection rule, we can conclude that $T_v$ has
remaining demand strictly less than $u$. Therefore, total outflow from $T_v$ will be
at most $u - 1$. Hence, Claim 1 holds in this case.

The reason Claim 2 holds is as follows. When there exists an inflow into $T_v$,
flow is accumulated at a hub node in $T_v$. Since the algorithm accumulates a
flow of exactly $u$ at any hub node, a flow of at most $u - 1$ will go into $T_v$. The
algorithm first picks a subtree and a hub node in it, and collects demand starting
with the subtrees of $T_v$. Therefore, the algorithm will not collect sources out of
$T_v$ unless all the sources in $T_v$ have already been collected. This implies that
once flow enters $T_v$, none of the nodes in $T_v$ will become a hub node again and
hence flow will not enter $T_v$ again.

Now let us assume that $T_v$ is not partially assigned. Then all the sources in
$T_v$ are collected in the same set by the algorithm. If these sources are routed
to a hub node out of the subtree, then outflow is at most $u - 1$. If the sources
are routed to a hub node in the subtree, then inflow is at most $u - 1$. Inflow or
outflow occurs only once. Thus, Claims 1 and 2 hold in this case, too.

For any edge of T, flow in one direction does not exceed u, by Claims 1 and 2. 
When there exists flow in both directions in an edge with total value greater than 
u, we cancel flow of equal value in opposite directions such that total flow will 
not exceed u. Cancelling flow will lead to reassigning some of the source nodes 
to hubs. See Figure 3 for an example. 




a) On edge $e$, sum of flow in both directions exceeds $u$, where $u = 10$.
b) Sources are reassigned to hubs after flow of value 5 is cancelled on edge $e$.

Fig. 3. An example of cancelling flow and reassigning sources to hub nodes. Here
$w_1$ and $w_2$ are hub nodes chosen in the order of their indices.



Theorem 1. There is a $(1 + \rho_{ST})$-approximation algorithm for the single-sink
capacity installation problem with a single cable type and uniform demand.

Proof. Consider Algorithm Uniform. Let $C_{OPT}$ be the cost of an optimal solution
and $C_{HEUR}$ be the cost of a solution output by the algorithm. Let $C_{ST}$ denote
the cost of the cables installed on the edges of the approximate Steiner tree
$T$. Let $C_{DR}$ be the cost of cables installed on the direct edges added by the
algorithm.

By Lemma 1, at most one copy of cable is sufficient to accommodate flow on the
edges of the approximate Steiner tree $T$. The cost of a Steiner tree with terminal
set $S \cup \{t\}$ is a lower bound on the optimal cost because we must connect the
nodes in $S$ to $t$ and install at least one copy of the cable on each connecting
edge. Therefore, $C_{ST} \leq \rho_{ST} C_{OPT}$.

For a source set $C_k$ collected at iteration $k$, since $|C_k| = u$, the algorithm
installs one copy of the cable on the shortest direct edge from the subtree $T_v$,
which contains $C_k$, to $t$. The term $\sum_i \frac{dem_i}{u} \cdot dist(s_i, t)$ is a lower bound
on $C_{OPT}$, since $dem_i$ must be routed a distance of at least $dist(s_i, t)$ and be
charged at least at the rate $1/u$ per unit length. (In the uniform demand case,
$dem_i = 1$ for all $i$.) Since source sets collected by the algorithm are disjoint,
$\sum_k \sum_{s_i \in C_k} \frac{dem_i}{u} \cdot dist(s_i, t) = \sum_k \sum_{s_i \in C_k} \frac{dist(s_i, t)}{u}$
is a lower bound on $C_{OPT}$ as well. As demand of a set $C_k$ is sent via the source
in $C_k$ that is closest to $t$ (the hub node $w_k$), we get

$dist(w_k, t) = \min_{s_i \in C_k} dist(s_i, t) \leq \sum_{s_i \in C_k} \frac{dist(s_i, t)}{u}$.   (1)

Thus, we finally have

$C_{DR} = \sum_k dist(w_k, t) \leq \sum_k \sum_{s_i \in C_k} \frac{dist(s_i, t)}{u} \leq C_{OPT}$.   (2)

Therefore, $C_{HEUR} = C_{ST} + C_{DR} \leq (1 + \rho_{ST}) C_{OPT}$.


3 Non-uniform Demand 

When source nodes have arbitrary demand, $\mathrm{dem}_i$ for source $s_i$, it is no longer
possible to collect sources with total demand exactly equal to the capacity u. If
we were allowed to split the (integral) demand of any source into single integral
units, each of which could be routed on a separate path to the sink, then the
algorithm of the previous section could be used by expanding each source $s_i$ into
$\mathrm{dem}_i$ sources connected by zero-length edges in the tree. However, in the more
general case, all the flow of a source must use the same path to the sink. In
this case, we modify Algorithm Uniform so that we send demand directly to the
sink when it accumulates to an amount between u/2 and u. To guarantee that
we do not exceed u while collecting demand, we send all sources with demand at
least u/2 directly at the beginning of the algorithm.

For a source set C, let dem(C) be the total demand of sources in C. Recall
that D(C) is the total remaining (unprocessed) demand of C, as defined for
the uniform demand case. The modified algorithm, which we call Algorithm
Non-uniform, is as follows.

Algorithm Non-uniform:

Initialize: R = S.

Preprocessing: (send large demands directly)
    For all sources $s_i$ such that $\mathrm{dem}_i \ge u/2$,
        Route the demand on $(s_i, t)$.
        Install $\lceil \mathrm{dem}_i/u \rceil$ copies of cable on $(s_i, t)$.
        Remove $s_i$ from R.

Main step:
    Pick a node v such that $D(T_v) \ge u/2$ and level of v is maximum.
    If v = t, or $D(T_t) < u/2$, then go to the final step.
    Pick a node, say w, in $R \cap T_v$ such that dist(w,t) is minimum, as a "hub" node.
    Let C = {w}.
    Collect source nodes in C (details given below).
    Add edge (w,t) to the network and install one copy of the cable on (w,t).
    Route demand of each source in C to the hub node via the unique path in T.
    Route demand of C at the hub directly to the sink on (w,t).
    Remove C from R and set $C = \emptyset$.
    If R is not empty, repeat the main step.
    If R is empty, go to the final step.

Final step:
    If $R \ne \emptyset$, then route all the demand in R to t via the unique paths in T.
    Install one copy of cable on the edges of T which have positive flow.

Collect source nodes:
    Add v to C, if $v \in R$.
    Let $v_1, \ldots, v_k$ be the children of v.
    If $w \ne v$, then
        Let $v_p$ be the child of v such that the hub node w is in $T_{v_p}$.
        Add $T_{v_p} \cap R$ to C.
    While dem(C) < u/2,
        Pick an unprocessed child of v, say $v_i$.
        Add $T_{v_i} \cap R$ to C.
    Return C.
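The steps of Algorithm Non-uniform can be traced with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the tree T is assumed to be given as a hypothetical `children` map rooted at the sink, `dist[s]` stands for dist(s, t), thresholds are read as "at least u/2", ties are broken by sorted order, and the number of cable copies installed in preprocessing is assumed to be the ceiling of dem/u.

```python
from math import ceil

def subtree(children, v):
    """Nodes of the subtree T_v rooted at v, each parent listed before its children."""
    order, stack = [], [v]
    while stack:
        x = stack.pop()
        order.append(x)
        stack.extend(children.get(x, []))
    return order

def non_uniform(children, t, dem, dist, u):
    """Sketch of Algorithm Non-uniform (input format is an assumption).

    children: child lists of the tree T rooted at the sink t
    dem: source -> demand; dist: source -> dist(source, t); u: cable capacity
    Returns (direct, R): direct maps a hub to (collected sources, cable copies);
    R holds the sources left for the final step (routed to t along T).
    """
    R = set(dem)
    direct = {}
    # Preprocessing: sources with demand at least u/2 are sent directly.
    for s in sorted(R):
        if dem[s] >= u / 2:
            direct[s] = ({s}, ceil(dem[s] / u))  # copy count is an assumption
            R.discard(s)
    level = {t: 0}
    for x in subtree(children, t):
        for c in children.get(x, []):
            level[c] = level[x] + 1

    def rem(v):
        """Remaining sources in T_v, i.e. the set R intersected with T_v."""
        return {s for s in subtree(children, v) if s in R}

    def total(C):
        return sum(dem[s] for s in C)

    while R:
        # Main step: deepest node v whose remaining demand D(T_v) is at least u/2.
        heavy = [v for v in level if total(rem(v)) >= u / 2]
        if not heavy:
            break
        v = max(heavy, key=level.get)
        if v == t:
            break
        w = min(sorted(rem(v)), key=dist.get)  # hub: remaining source closest to t
        # Collect source nodes.
        C = {w}
        if v in R:
            C.add(v)
        kids = list(children.get(v, []))
        if w != v:
            vp = next(c for c in kids if w in subtree(children, c))
            C |= rem(vp)
            kids.remove(vp)
        while total(C) < u / 2:
            C |= rem(kids.pop(0))
        direct[w] = (C, 1)
        R -= C
    return direct, R

# Hypothetical instance: sink t, internal nodes a and b, sources s1..s4, u = 10.
children = {"t": ["a", "b"], "a": ["s1", "s2", "s3"], "b": ["s4"]}
dem = {"s1": 3, "s2": 3, "s3": 2, "s4": 7}
dist = {"s1": 5, "s2": 4, "s3": 6, "s4": 3}
direct, leftover = non_uniform(children, "t", dem, dist, u=10)
```

On this instance s4 is sent directly in preprocessing, s2 becomes the hub for {s1, s2} with total demand 6, and s3 is left for the final step.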



Lemma 2. The algorithm routes demand such that:

1) flow on any edge of the tree T is at most u, and

2) flow on a direct edge added by the algorithm is at least u/2 and at most u.

Proof. The proof is simpler than in the uniform-demand case because the
algorithm does not assign any subtree partially. Consider an edge e of T. Let v
be the endpoint of e such that e is not in $T_v$. Since all the sources in $T_v$ are collected
in the same set by the algorithm, demand of these sources is routed to a hub
node either out of the subtree, or in the subtree, but not both. Thus, flow exists
only in one direction. If the demand of sources is routed to a hub node out of
$T_v$, then outflow is at most $u - 1$. If the demand is routed to a hub node in the
subtree, then inflow is at most $u - 1$. Thus, for any edge of T, flow does not
exceed u.

Due to the subtree selection rule in the algorithm, if a subtree $T_v$ is selected,
then all the subtrees rooted at its children have remaining demand strictly less
than u/2. Therefore, the first time dem(C) reaches u/2, it will be at most u,
so that total flow on the direct edges added by the algorithm is in the range
[u/2, u].



Theorem 2. There is a $(2 + \rho_{ST})$-approximation algorithm for the single-sink
edge installation problem with a single cable type and non-uniform demand.

Proof. We use the same definitions of $C_{OPT}$, $C_{HEUR}$, $C_{DR}$ and $C_{ST}$ as in the
proof of Theorem 1.

By Lemma 2, one copy of the cable is sufficient to accommodate the flow
on the edges of the approximate Steiner tree T. Therefore, $C_{ST} \le \rho_{ST} C_{OPT}$.
For a source set $C_k$ collected at iteration k, the algorithm installs one copy
of the cable on the shortest direct edge from the subtree $T_v$, which encloses
$C_k$, to t. By Lemma 2, one copy of the cable is sufficient to accommodate the
flow on the direct edges from hub nodes to t, and $\mathrm{dem}(C_k) \ge u/2$. The term
$\sum_{s_i \in S} \frac{\mathrm{dem}_i}{u} \cdot \mathrm{dist}(s_i,t)$ is a lower bound on $C_{OPT}$ as in the uniform demand case.
Since source sets collected by the algorithm have disjoint sources and demand
from a set $C_k$ is sent via the source in $C_k$ that is closest to t (the hub node $w_k$),

$$C_{OPT} \ge \sum_k \sum_{s_i \in C_k} \frac{\mathrm{dem}_i}{u}\, \mathrm{dist}(s_i,t) \ge \sum_k \Big( \sum_{s_i \in C_k} \frac{\mathrm{dem}_i}{u} \Big) \Big( \min_{s_i \in C_k} \mathrm{dist}(s_i,t) \Big). \qquad (3)$$

Since $\sum_{s_i \in C_k} \mathrm{dem}_i \ge \frac{u}{2}$ and $\min_{s_i \in C_k} \mathrm{dist}(s_i,t) = \mathrm{dist}(w_k,t)$, we have

$$C_{OPT} \ge \sum_k \sum_{s_i \in C_k} \frac{\mathrm{dem}_i}{u}\, \mathrm{dist}(w_k,t) \ge \frac{1}{2} \sum_k \mathrm{dist}(w_k,t) = \frac{1}{2} C_{DR}. \qquad (4)$$

Therefore, $C_{HEUR} = C_{ST} + C_{DR} \le (2 + \rho_{ST}) C_{OPT}$.

4 Extensions 

Our methods apply to the following extension of the local access network design
problem: Instead of specifying a single sink node, any node v in the graph can
be used as a node that sinks u units of demand at a cost of $f_v$. A node is allowed
to sink more than u units of demand by paying $\lceil \mathrm{dem}/u \rceil \cdot f_v$ to sink dem
units of flow. The problem is to open a sufficient number of sinks and route all the
demands to these sinks at minimum cable plus sink opening cost.

To model this extension, we extend the metric in two steps: 1) create a new
sink node t with edges to every vertex v of cost $f_v$, 2) take the metric completion
of this augmented network. Notice that the second step may decrease some of
the costs on the edges incident on the new sink t (e.g., if $f_i + \mathrm{dist}(j,i) < f_j$,
then the cost of the edge (j,t) can be reduced from $f_j$ to $f_i + \mathrm{dist}(j,i)$), or
between any pair of original nodes (e.g., if $\mathrm{dist}(i,j) > f_i + f_j$, then we may
replace the former by the latter). Bearing this in mind, it is not hard to see that
any solution in the new graph to the single cable problem with t as the sink and
with the modified costs can be converted to a solution to the original problem
of the same cost. Thus, our algorithms in the previous sections apply to give the
same performance guarantees.
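The two-step metric extension can be illustrated with a small Python sketch. It is only an illustration under assumptions: the function name `extend_metric`, the input dictionaries, and the label "t" for the new sink are hypothetical, and the metric completion is computed with the standard Floyd-Warshall all-pairs shortest path recurrence.

```python
def extend_metric(dist, f):
    """Add a new sink and take the metric completion (illustrative sketch).

    dist: dict (a, b) -> cost between original nodes (one entry per unordered pair)
    f:    dict node -> cost of opening that node as a sink
    Returns a dict d with d[a, b] = shortest-path cost in the augmented network.
    """
    SINK = "t"  # hypothetical label; assumed not to clash with an original node
    nodes = list(f) + [SINK]
    INF = float("inf")
    d = {(a, b): (0.0 if a == b else INF) for a in nodes for b in nodes}
    for (a, b), c in dist.items():
        d[a, b] = d[b, a] = min(d[a, b], c)
    # Step 1: edge from every vertex v to the new sink, of cost f_v.
    for v, cost in f.items():
        d[v, SINK] = d[SINK, v] = cost
    # Step 2: metric completion by Floyd-Warshall.
    for k in nodes:
        for a in nodes:
            for b in nodes:
                if d[a, k] + d[k, b] < d[a, b]:
                    d[a, b] = d[a, k] + d[k, b]
    return d

# Hypothetical instance with three nodes.
dist = {("i", "j"): 1.0, ("i", "k"): 10.0, ("j", "k"): 10.0}
f = {"i": 2.0, "j": 5.0, "k": 3.0}
d = extend_metric(dist, f)
```

Here the cost of (j, t) drops from 5 to 2 + 1 = 3, and the cost between i and k drops from 10 to 2 + 3 = 5, matching the two kinds of decrease described above.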
Abstract. We consider a fault tolerant version of the metric facility
location problem in which every city, j, is required to be connected to
$r_j$ facilities. We give the first non-trivial approximation algorithm for
this problem, having an approximation guarantee of $3 \cdot H_k$, where k is
the maximum requirement and $H_k$ is the k-th harmonic number. Our
algorithm is along the lines of [2] for the generalized Steiner network
problem. It runs in phases, and each phase, using a generalization of
the primal-dual algorithm of [4] for the metric facility location problem,
reduces the maximum residual requirement by 1.



1 Introduction 

Given costs for opening facilities and costs for connecting cities to facilities,
the uncapacitated facility location problem seeks a minimum cost solution that
connects each city to an open facility. In the fault tolerant
version, each city must be connected to a specified number of facilities. Formally,
we are given a set of cities and a set of facilities. For each city we are given its 
connectivity requirement and for each facility we are given its opening cost. For 
each city-facility pair, we are given the cost of connecting the city to the facility. 
We assume that the connection costs satisfy the triangle inequality. We want to 
open facilities and connect each city to as many open facilities as its connectivity 
requirement such that the total cost of opening facilities and connecting cities is 
minimized. This problem has potential industrial applications where the facilities 
and the connections are susceptible to failure. 

We give a $3 \cdot H_k$ factor approximation algorithm, where k is the maximum
requirement and $H_k = 1 + 1/2 + 1/3 + \cdots + 1/k$. Our algorithm is along the
lines of [2] for the generalized Steiner network problem. It runs in phases, and
in each phase, reduces the maximum residual requirement by 1. In each phase
it considers only those cities which have the maximum residual requirement.
The procedure for a phase will give each of these cities one more connection to
open facilities. In contrast to the usual facility location problem, a facility may
not provide a new connection to every city. We show that a generalization of
the primal-dual algorithm in [4] works for each phase with a performance factor of
3. In the case of the generalized Steiner network problem, adapting the primal-dual
algorithm for the Steiner forest problem to a phase of the generalized Steiner
network problem took significant work [5]. In contrast, in the case of the facility
location problem, this adaptation is straightforward, demonstrating a strength
of the primal-dual schema for the facility location problem [4].
2 The Fault Tolerant Metric Uncapacitated Facility
Location Problem

The uncapacitated facility location problem seeks a minimum cost way of connecting
cities to open facilities. It can be stated formally as follows: Let G be
a bipartite graph with bipartition (F, C), where F is the set of facilities and C
is the set of cities. Let $f_i$ be the cost of opening facility i, $r_j$ be the number of
facilities city j should be connected to, and $c_{ij}$ be the cost of connecting city j to
(opened) facility i. The problem is to find a subset $I \subseteq F$ of facilities that should
be opened, and a function $\mathcal{C} : C \to 2^{I}$ assigning each city to a set of open facilities,
such that each city j is assigned to a set of cardinality $r_j$ and the total cost of
opening facilities and connecting cities to open facilities is minimized. We will
consider the metric version of this problem, i.e., the $c_{ij}$'s satisfy the triangle
inequality.



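Consider the following integer program for this problem. In this program,
$y_i$ is an indicator variable denoting whether facility i is open, and $x_{ij}$ is an
indicator variable denoting whether city j is connected to facility i. The first
constraint ensures that each city j is connected to at least $r_j$ facilities and the
second ensures that each of these facilities is open.

$$\begin{aligned}
\text{minimize} \quad & \sum_{i \in F,\, j \in C} c_{ij} x_{ij} + \sum_{i \in F} f_i y_i \qquad (1) \\
\text{subject to} \quad & \forall j \in C: \ \sum_{i \in F} x_{ij} \ge r_j \\
& \forall i \in F, j \in C: \ y_i - x_{ij} \ge 0 \\
& \forall i \in F, j \in C: \ x_{ij} \in \{0, 1\} \\
& \forall i \in F: \ y_i \in \{0, 1\}
\end{aligned}$$

An LP-relaxation of this program is:

$$\begin{aligned}
\text{minimize} \quad & \sum_{i \in F,\, j \in C} c_{ij} x_{ij} + \sum_{i \in F} f_i y_i \qquad (2) \\
\text{subject to} \quad & \forall j \in C: \ \sum_{i \in F} x_{ij} \ge r_j \\
& \forall i \in F, j \in C: \ y_i - x_{ij} \ge 0 \\
& \forall i \in F, j \in C: \ x_{ij} \ge 0 \\
& \forall i \in F: \ 1 \ge y_i \ge 0
\end{aligned}$$

The dual program is:

$$\begin{aligned}
\text{maximize} \quad & \sum_{j \in C} r_j \alpha_j - \sum_{i \in F} z_i \qquad (3) \\
\text{subject to} \quad & \forall i \in F, j \in C: \ \alpha_j - \beta_{ij} \le c_{ij} \\
& \forall i \in F: \ \sum_{j \in C} \beta_{ij} \le f_i + z_i \\
& \forall j \in C: \ \alpha_j \ge 0 \\
& \forall i \in F, j \in C: \ \beta_{ij} \ge 0 \\
& \forall i \in F: \ z_i \ge 0
\end{aligned}$$

We will adopt the following notation: $n_c = |C|$ and $n_f = |F|$. The total
number of vertices is $n_c + n_f = n$ and the total number of edges is $n_c \times n_f = m$. The
maximum of the $r_j$'s is k. The optimum of the integer program is OPT and that of the
linear program is $OPT_f$.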

2.1 The High Level Algorithm 

Our algorithm opens facilities and assigns them to cities in k phases numbered
from k down to 1. Each phase decreases the maximum residual requirement,
which is the maximum number of further facilities needed by a city, by 1. Hence
at the beginning of the p-th phase the maximum residual requirement is p and at
the end of it the maximum residual requirement is $p - 1$.

The algorithm starts with an empty solution $(I_k, C_k)$. The p-th phase of the
algorithm takes the solution $(I_p, C_p)$ and extends it to $(I_{p-1}, C_{p-1})$ such that the
maximum residual requirement is decreased by one, thereby maintaining the
loop invariant that the maximum residual requirement with respect to solution
$(I_p, C_p)$ is p. Hence, $(I_0, C_0)$ is a feasible solution. In the next section, we will
show the following theorem.

Theorem 1. Cost of $(I_{p-1}, C_{p-1})$ minus the cost of $(I_p, C_p)$ is at most $3 \cdot OPT/p$.
Corollary 1. Cost of $(I_0, C_0)$ is at most $3 \cdot H_k \cdot OPT$.
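
The corollary follows by summing the bound of Theorem 1 over the k phases. As a short worked derivation (writing cost(·) for the cost of a solution, a notation assumed here, and using that the empty starting solution has cost 0):

```latex
\mathrm{cost}(I_0, C_0)
  = \sum_{p=1}^{k} \bigl( \mathrm{cost}(I_{p-1}, C_{p-1}) - \mathrm{cost}(I_p, C_p) \bigr)
  \le \sum_{p=1}^{k} \frac{3 \cdot OPT}{p}
  = 3 \cdot H_k \cdot OPT .
```

For example, with maximum requirement k = 3 we have $H_3 = 1 + 1/2 + 1/3 = 11/6$, so the guarantee is $5.5 \cdot OPT$.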

3 The p-th Phase 

This phase extends the solution $(I_p, C_p)$ to $(I_{p-1}, C_{p-1})$ so that each city j
with residual requirement p with respect to the solution $(I_p, C_p)$ gets connected
to at least one more open facility. This can happen in two ways. First, a new facility
is opened in $(I_{p-1}, C_{p-1})$ and j is connected to it. Second, j is connected to an
already open facility in $(I_p, C_p)$ to which it was not connected. In the first case,
both the facility and the connection must be paid for in this phase itself, whereas in
the second case only the connection needs to be paid for.

So in this phase, facilities are of two types, free and priced. The set of free
facilities is $I_p$. A priced facility, if opened, can be used by any city, whereas a free
facility can be used only by those cities which are not already using it. Denote
the set of cities with residual requirement p by $C_p$, and let $C_p(j)$ denote the set
of facilities to which city j is already connected in $(I_p, C_p)$. The problem of this phase
can be written as the following integer program.

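$$\begin{aligned}
\text{minimize} \quad & \sum_{i \in F,\, j \in C_p} c_{ij} x_{ij} + \sum_{i \in F - I_p} f_i y_i \qquad (4) \\
\text{subject to} \quad & \forall j \in C_p: \ \sum_{i \in F - C_p(j)} x_{ij} \ge 1 \\
& \forall i \in F - I_p, j \in C_p: \ y_i - x_{ij} \ge 0 \\
& \forall i \in F, j \in C: \ x_{ij} \in \{0, 1\} \\
& \forall i \in F - I_p: \ y_i \in \{0, 1\}
\end{aligned}$$

An LP-relaxation of this program is:

$$\begin{aligned}
\text{minimize} \quad & \sum_{i \in F,\, j \in C_p} c_{ij} x_{ij} + \sum_{i \in F - I_p} f_i y_i \qquad (5) \\
\text{subject to} \quad & \forall j \in C_p: \ \sum_{i \in F - C_p(j)} x_{ij} \ge 1 \\
& \forall i \in F - I_p, j \in C_p: \ y_i - x_{ij} \ge 0 \\
& \forall i \in F, j \in C: \ x_{ij} \ge 0 \\
& \forall i \in F: \ y_i \ge 0
\end{aligned}$$

The dual program is:

$$\begin{aligned}
\text{maximize} \quad & \sum_{j \in C_p} \alpha_j \qquad (6) \\
\text{subject to} \quad & \forall i \in F - I_p, j \in C_p: \ \alpha_j - \beta_{ij} \le c_{ij} \\
& \forall i \in I_p, j \in C_p: \ \alpha_j \le c_{ij} \\
& \forall i \in F - I_p: \ \sum_{j \in C_p} \beta_{ij} \le f_i \\
& \forall j \in C: \ \alpha_j \ge 0 \\
& \forall i \in F, j \in C: \ \beta_{ij} \ge 0
\end{aligned}$$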

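Theorem 2. Optimum solution of LP 5 is at most $OPT_f / p$.

Proof. Let the optimum of LP 5 be $OPT_p$. By the strong duality theorem of
linear programming, there is a feasible solution to the dual LP 6 of value
$OPT_p$. Let $(\alpha, \beta)$ be one such solution. The following procedure
extends this solution to a feasible solution to the dual LP 3 of value $p \cdot OPT_p$.
Since every feasible solution to LP 3 has value at most $OPT_f$, this
proves the theorem.

1. $\forall j \in C - C_p$: $\alpha_j \leftarrow 0$.

2. $\forall j \in C - C_p, i \in F$: $\beta_{ij} \leftarrow 0$.

3. $\forall j \in C_p, i \in C_p(j)$: $\beta_{ij} \leftarrow \alpha_j$.

4. $\forall i \in I_p$: $z_i \leftarrow \sum_{j \in C_p :\, i \in C_p(j)} \beta_{ij}$ (and $z_i \leftarrow 0$ for $i \in F - I_p$).

Denote this extended solution by $(\alpha, \beta, z)$. One can easily check that $(\alpha, \beta, z)$
is a feasible solution to LP 3. Its value is

$$\begin{aligned}
\sum_{j \in C} r_j \alpha_j - \sum_{i \in F} z_i
&= \sum_{j \in C_p} r_j \alpha_j - \sum_{i \in I_p} \sum_{j \in C_p :\, i \in C_p(j)} \beta_{ij}
 = \sum_{j \in C_p} r_j \alpha_j - \sum_{j \in C_p} \sum_{i \in C_p(j)} \beta_{ij} \\
&= \sum_{j \in C_p} r_j \alpha_j - \sum_{j \in C_p} \sum_{i \in C_p(j)} \alpha_j
 = \sum_{j \in C_p} r_j \alpha_j - \sum_{j \in C_p} |C_p(j)|\, \alpha_j \\
&= \sum_{j \in C_p} \bigl( r_j - |C_p(j)| \bigr) \alpha_j
 = \sum_{j \in C_p} p\, \alpha_j
 = p \sum_{j \in C_p} \alpha_j = p \cdot OPT_p .
\end{aligned}$$

In the next section we will adapt the primal-dual algorithm of [4] to show
the following theorem.

Theorem 3. Cost of $(I_{p-1}, C_{p-1})$ minus the cost of $(I_p, C_p)$ is at most $3 \cdot OPT_p$.

Corollary 2. Cost of $(I_{p-1}, C_{p-1})$ minus the cost of $(I_p, C_p)$ is at most
$3 \cdot OPT/p$.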



4 Primal-Dual Algorithm for the p-th Phase 

Our algorithm is essentially the same as the primal-dual algorithm in [4] except
for the following differences.

1. Duals of only those cities which have residual requirement p will be raised.

2. Facilities in $I_p$ are free; others carry their original costs.

3. Connections already used in $(I_p, C_p)$ have infinite cost. The costs of other
connections remain the same.

For completeness, we reproduce the primal-dual algorithm of [4] with
the above-mentioned changes. The algorithm runs in two phases. The first phase
runs in a primal-dual fashion to find a tentative solution and the second modifies
it so that the primal becomes at most thrice the dual. The algorithm has a
notion of time. It begins at time zero with a zero primal and a zero dual solution.
At time zero, all cities in $C_p$ are unconnected and all facilities except free facilities
are closed. Free facilities are open.

As time passes the algorithm raises the dual variable $\alpha_j$ for each unconnected
city uniformly at rate one. Now the following two kinds of events can
happen:

1. The dual constraint corresponding to a connection ij goes tight, i.e., $\alpha_j - \beta_{ij} = c_{ij}$.
Such a connection is declared tight. The algorithm performs one of the
following steps according to the state of facility i.

(a) If facility i is (tentatively) open then city j is declared (tentatively)
connected to facility i. The dual variable for this city will not be raised any
further.

(b) If facility i is closed then $\beta_{ij}$ will begin responding to the raise of $\alpha_j$, i.e.,
whenever $\alpha_j$ is raised, $\beta_{ij}$ is also raised by the same amount
to maintain the feasibility of $\alpha_j - \beta_{ij} \le c_{ij}$.

2. The dual constraint corresponding to a facility i goes tight, i.e., $\sum_{j \in C_p} \beta_{ij} = f_i$.
This facility is declared tentatively opened. Every unconnected city having a
tight edge to this facility is declared tentatively connected to this facility.

The first phase of the algorithm ends when there is no more unconnected
city. A city j is said to be overpaying if there are at least two tentatively open
facilities $i_1$ and $i_2$ such that both $\beta_{i_1 j}$ and $\beta_{i_2 j}$ are positive. The second phase
picks a maximal set of tentatively open facilities such that no city is overpaying.
All facilities in this maximal set are opened and all other tentatively opened
facilities are closed. Any city having a tight edge to an open facility is declared
connected to it. The next lemma follows by this construction.
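
The second phase can be pictured as a greedy selection. The sketch below is a minimal illustration, not the procedure of [4]: the function name `prune_tentative`, the dictionary encoding of the dual values, and the choice to scan facilities in the order they were tentatively opened are all assumptions made for the example.

```python
def prune_tentative(tentative, beta, eps=1e-12):
    """Greedily pick a maximal set of tentatively open facilities such that
    no city has a positive beta towards two picked facilities.

    tentative: facilities in the (assumed) order they were tentatively opened
    beta:      dict (facility, city) -> dual value beta_ij
    """
    # Cities that pay a positive amount towards each tentatively open facility.
    payers = {
        i: {j for (fac, j), b in beta.items() if fac == i and b > eps}
        for i in tentative
    }
    opened, paying_cities = [], set()
    for i in tentative:
        # Opening i is allowed only if none of its payers already pays
        # towards a facility opened earlier.
        if payers[i].isdisjoint(paying_cities):
            opened.append(i)
            paying_cities |= payers[i]
    return opened

# Hypothetical dual values: city "a" pays towards both f1 and f2.
beta = {
    ("f1", "a"): 1.0,
    ("f2", "a"): 0.5,
    ("f2", "b"): 2.0,
    ("f3", "b"): 0.0,
    ("f3", "c"): 1.0,
}
opened = prune_tentative(["f1", "f2", "f3"], beta)
```

The result is maximal because every rejected facility shares a paying city with a facility opened before it; here f2 is closed because city a already pays towards f1.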
Lemma 1.

$$\sum_{i \in F,\, j \in C_p \text{ and } j \text{ is connected to } i} c_{ij} \; + \sum_{i \in F - I_p \text{ and } i \text{ is open}} f_i \; = \sum_{j \text{ is connected}} \alpha_j .$$

The performance gap of factor 3 comes from the tentatively connected cities.
Consider a tentatively connected city j. Suppose it was tentatively connected to
facility i, which got closed. Since we picked a maximal set of tentatively opened
facilities such that no city is overpaying, there must be a city j' which was paying
to this facility i and to an opened facility, say i'. City j is connected to the facility
i'. The next lemma establishes the performance guarantee of 3.

Lemma 2. For any tentatively connected city, the connection cost is at most
three times the dual raised by it.

Let $t_i$ and $t_{i'}$ respectively be the times at which the facilities i and i' are
declared tight. The proof follows from the following three observations and the
triangle inequality. Note that the facility $i' \notin I_p$, hence the triangle inequality is
maintained for this situation.

1. Since j is declared tentatively connected to i, $\alpha_j \ge t_i$ and $\alpha_j \ge c_{ij}$.

2. Since connections ij' and i'j' are both tight, $\alpha_{j'} \ge c_{ij'}$ and $\alpha_{j'} \ge c_{i'j'}$.

3. Since, during the first phase, $\alpha_{j'}$ stops being raised as soon as one of
the facilities to which j' has a tight edge is tentatively opened, $\alpha_{j'} \le \min(t_i, t_{i'})$.

Using the first and the last we have $\alpha_{j'} \le \alpha_j$, which together with the second
gives $c_{ij} + c_{ij'} + c_{i'j'} \le 3 \cdot \alpha_j$. Hence by the triangle inequality we get $c_{i'j} \le 3 \cdot \alpha_j$.
Abstract. A tree (tour) cover of an edge-weighted graph is a set of
edges which forms a tree (closed walk) and covers every other edge in
the graph.

Arkin, Halldórsson and Hassin (Information Processing Letters 47:275-282,
1993) give approximation algorithms with ratio 3.55 (tree cover)
and 5.5 (tour cover). We present algorithms with worst-case ratio 3 for
both problems.



1 Introduction 

1.1 Problem Statement and Notation 

Let G = (V, E) be an undirected graph with a (nonnegative) weight function
$c : E \to \mathbb{Q}_+$ defined on the edges. A tree cover (tour cover) of G is a subgraph
T = (U, F) such that (1) for every $e \in E$, either $e \in F$ or F contains an edge
f adjacent to e: $F \cap N(e) \ne \emptyset$, and (2) T is a tree (closed walk). (We allow the
tour cover to be a closed walk in order to avoid restricting the weight function c
to be a metric. Our algorithm for tour cover produces a closed walk in G, but if
the weight function c satisfies the triangle inequality, this walk may be short-cut
into a simple cycle which covers all edges in E without increasing the weight.)

The tree cover (tour cover) problem consists in finding a tree cover (tour
cover) of minimum total weight:

$$\min \sum_{e \in F} c_e,$$

over subgraphs H = (U, F) which form a tree cover (tour cover) of G.
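
To make the definition concrete, the following sketch checks whether a nonempty edge set F is a tree cover of a graph. It is an illustration only: the function name `is_tree_cover` and the edge-list encoding are assumptions, and the single-vertex tree (empty F) is deliberately not handled.

```python
def is_tree_cover(edges, F):
    """Return True if the nonempty edge set F forms a tree and every edge of
    the graph is in F or shares an endpoint with an edge of F."""
    F = list(F)
    U = {v for e in F for v in e}  # vertices spanned by F
    parent = {v: v for v in U}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in F:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # F contains a cycle, so it is not a tree
        parent[ra] = rb
    if len({find(v) for v in U}) != 1:
        return False  # F is empty or disconnected
    # An edge is covered exactly when one of its endpoints lies in U.
    return all(a in U or b in U for a, b in edges)

# Hypothetical example: the path a-b-c-d.
path = [("a", "b"), ("b", "c"), ("c", "d")]
covers = is_tree_cover(path, [("b", "c")])
misses = is_tree_cover(path, [("a", "b")])
```

On the path a-b-c-d the single edge bc is a tree cover, while the single edge ab is not, because it leaves cd uncovered.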

* Supported in part by the W. L. Mellon Fellowship. 

** Supported in part by the NSF CAREER grant CCR-9625297. 
K. Jansen and S. Khuller (Eds.): APPROX 2000, LNCS 1913, pp. 184-193, 2000.
© Springer-Verlag Berlin Heidelberg 2000




Improved Approximations for Tour and Tree Covers 






For a subset of vertices $S \subseteq V$, we write $\delta(S)$ for the set of edges with exactly
one endpoint inside S. If $x \in \mathbb{R}^{|E|}$ is a vector indexed by the edges of a graph
G = (V, E) and $F \subseteq E$ is a subset of edges, we use x(F) to denote the sum of
values of x on the edges in the set F, $x(F) = \sum_{e \in F} x_e$.



1.2 Previous Work 

The tree and tour cover problems were introduced by Arkin, Halldórsson and Hassin
[1]. The motivation for their study comes from the close relation of the
tour cover problem to vertex cover, watchman route and traveling purchaser 
problems. They provide fast combinatorial algorithms for the weighted ver- 
sions of these problems achieving approximation ratios 5.5 and 3.55 respectively 
(3.55 is slightly lower than their claim — the reason being the recent improve- 
ments in minimum Steiner tree approximation [8]). For unweighted versions 
their best approximation ratios are 3 (tour cover) and 2 (tree cover), and they 
also show how to find a 3-approximate tree cover in linear time. Finally, they 
give approximation-preserving reductions to vertex cover and traveling salesman 
problem, showing that tree and tour cover are MAXSNP-hard problems. 

Our methods are similar to those used by Bienstock, Goemans, Simchi-Levi 
and Williamson [2], also referred to by Arkin et al. as a possible way of improving 
their results; however, our algorithms were developed independently and were in 
fact motivated primarily by the work of Carr, Fujito, Konjevod and Parekh [3] 
on approximating weighted edge-dominating sets. 



1.3 Algorithm Overview 

Both our algorithms run in two phases. In the first phase we identify a subset of 
vertices, and then in the second phase we find a walk or a tree on these vertices. 
Very informally, the algorithms can be described as follows. 

(1) Solve the linear programming relaxation of the tour cover (tree cover) prob- 
lem. 

(2) Using the optimal solution to the linear program, find a set U C V, such 
that V\U induces an independent set. 

(3) Find an approximately optimal tour (tree) on U. 

Part (3) above reduces to the invocation of a known algorithm for approxi- 
mating the minimum traveling salesman tour or the minimum Steiner tree. 



2 Tour Cover 

2.1 Linear Program 



We first describe an integer programming formulation of tour cover.
Let $\mathcal{F}$ denote the set of all subsets S of V such that both S and $V \setminus S$ induce
at least one edge of E,

$$\mathcal{F} = \{ S \subseteq V \mid E[S] \ne \emptyset,\ E[V \setminus S] \ne \emptyset \}.$$

Note that if C is a set of edges that forms a tour cover of G, then at least two
edges of C cross S, for every $S \in \mathcal{F}$. This observation motivates our integer
programming formulation of tour cover. For every edge $e \in E$, let the integer
variable $x_e$ indicate the number of copies of e included in the tour cover. We
minimize the total weight of edges included, under the condition that every cut
in $\mathcal{F}$ be crossed at least twice. In order to ensure our solution is a tour we
also need to specify that each vertex has even degree; however, we drop these
constraints and consider the following relaxation.

$$\begin{aligned}
\min \quad & \sum_{e \in E} c_e x_e \\
& \sum_{e \in \delta(S)} x_e \ge 2 \quad \text{for all } S \in \mathcal{F} \qquad (1) \\
& x \in \{0, 1, 2\}^{|E|}.
\end{aligned}$$

Note that since the optimum tour may use an edge of G more than once, we
cannot restrict the edge-variables to be zero-one. However, it is not difficult to
see that under a nonnegative weight function the minimal solution will never
use an edge more than twice. This follows since an Eulerian tour $T_1$ on a subset
$U \subseteq V$ of vertices may be transformed into an Eulerian tour $T_2$ on U such that
(1) no edge is used in $T_2$ more times than in $T_1$ and (2) no edge is used in $T_2$
more than twice.

Replacing the integrality constraints by

$$0 \le x \le 2,$$

we obtain the linear programming relaxation. We use ToC(G) to denote the
convex hull of all vectors x satisfying the constraints above (with integrality
constraints replaced by upper and lower bounds on x).

To show that the linear program over ToC(G) can be solved in polynomial time we appeal to the
ellipsoid method [7] and construct a separation oracle. We interpret a given
candidate solution x as the capacities on the edges of the graph G. For each pair
of edges $e_1, e_2 \in E$ we compute the minimum capacity cut in G that separates
them. The claim is that x is a feasible solution iff for each pair of edges $e_1, e_2 \in E$
the minimum-capacity $e_1, e_2$-cut has value at least 2. Clearly, if x is not a feasible
solution then our procedure will find a cut of capacity less than 2 having at least
one edge on either side. On the other hand, if our procedure returns a cut of
value less than 2 then x cannot be feasible.

Notice that the dual of (ToC(G)) fits into the packing framework and the
above oracle enables us to use fast combinatorial packing algorithms [4,5]. That
is, we avoid using the ellipsoid method, reducing the time complexity but at the
cost of losing a $(1 + \epsilon)$-factor in the approximation guarantee.
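
The constraint family can be made concrete with a brute-force feasibility check. This is not the polynomial-time min-cut oracle described above; it is an exponential-time sketch for tiny graphs that simply enumerates vertex subsets, and the function name `violated_cut` and the edge-tuple encoding of x are assumptions.

```python
from itertools import combinations

def violated_cut(vertices, edges, x, eps=1e-9):
    """Return a set S with an edge inside S and an edge inside its complement
    whose cut value x(delta(S)) is below 2, or None if no such set exists.
    Brute force over all subsets: only suitable for very small graphs."""
    vs = list(vertices)
    # Both S and its complement need at least two vertices to induce an edge.
    for r in range(2, len(vs) - 1):
        for S in map(set, combinations(vs, r)):
            inside = any(a in S and b in S for a, b in edges)
            outside = any(a not in S and b not in S for a, b in edges)
            if not (inside and outside):
                continue
            cut = sum(x[e] for e in edges if (e[0] in S) != (e[1] in S))
            if cut < 2 - eps:
                return S
    return None

# Hypothetical example: the path a-b-c-d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
x_ok = {("a", "b"): 0.0, ("b", "c"): 2.0, ("c", "d"): 0.0}
x_bad = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
ok_result = violated_cut(vertices, edges, x_ok)
bad_result = violated_cut(vertices, edges, x_bad)
```

On the path, the only relevant cut separates {a, b} from {c, d} and consists of the edge bc, so x is feasible exactly when the value on bc is at least 2.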
2.2 The Subtour Polytope

Let G = (V, E) be a graph whose edge-weights satisfy the triangle inequality:
for any u, v, and $w \in V$,

$$c_{uv} + c_{vw} \ge c_{uw}.$$

The subtour polytope ST(G) is defined as

$$ST(G) = \{ x \in [0, 1]^{|E|} \mid x(\delta(S)) \ge 2 \ \ \forall S \subset V,\ \emptyset \ne S \ne V, \text{ and } x(\delta(\{v\})) = 2 \ \ \forall v \in V \}.$$

In fact, the upper-bound constraints $x \le 1$ are redundant and

$$ST(G) = \{ x \ge 0 \mid x(\delta(S)) \ge 2 \ \ \forall S \subset V,\ \emptyset \ne S \ne V, \text{ and } x(\delta(\{v\})) = 2 \ \ \forall v \in V \}.$$



2.3 The Parsimonious Property 

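Let G = (V, E) be a complete graph with edge-weight function c. For every
pair of vertices $i, j \in V$, let a nonnegative integer $r_{ij}$ be given. The survivable
network design problem consists in finding the minimum-weight subgraph such
that for every pair of vertices $i, j \in V$, there are at least $r_{ij}$ edge-disjoint paths
between i and j. A linear programming relaxation of the survivable network
design problem is given by

$$\begin{aligned}
\min \quad & \sum_{e \in E} c_e x_e \\
& \sum_{e \in \delta(S)} x_e \ge \max_{i \in S,\, j \notin S} r_{ij} \quad \text{for all } S \subset V,\ \emptyset \ne S \ne V \qquad (2) \\
& x \ge 0.
\end{aligned}$$

Goemans and Bertsimas [6] prove the following.

Theorem 1. If the weight function c satisfies the triangle inequality then for
any $D \subseteq V$ the optimum of the linear program (2) is equal to the optimum of

$$\begin{aligned}
\min \quad & \sum_{e \in E} c_e x_e \\
& \sum_{e \in \delta(S)} x_e \ge \max_{i \in S,\, j \notin S} r_{ij} \quad \text{for all } S \subset V,\ \emptyset \ne S \ne V \\
& \sum_{e \in \delta(\{v\})} x_e = \max_{j \in V \setminus \{v\}} r_{vj} \quad \text{for all } v \in D \qquad (3) \\
& x \ge 0.
\end{aligned}$$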







Jochen Konemann et al. 



2.4 Algorithm 

We are now ready to state our algorithm for tour cover. 

(1) Let $x^*$ be the vector minimizing cx over ToC(G).

(2) Let $U = \{ v \in V \mid x^*(\delta(\{v\})) \ge 1 \}$.

(3) For any two vertices $u, v \in U$, if $uv \notin E$, let $c_{uv}$ be the weight of the shortest
u-v path in G.

(4) Run Christofides' heuristic to find an approximate minimum traveling salesman
tour on U.

The algorithm outputs a tour on U. Since U is a vertex cover of G, this tour
is in fact a tour cover of G.
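
Step (2) and the vertex cover property can be checked on a small instance. The sketch below assumes an LP solution is already available as a dictionary keyed by edge tuples (no LP solver is invoked); the names `threshold_vertices` and `is_vertex_cover` are hypothetical, and the sample vector is the solution on the path a-b-c-d that puts value 2 on the middle edge.

```python
def threshold_vertices(vertices, x, threshold=1.0, eps=1e-9):
    """U = set of vertices v whose incident edges carry total x-value
    x(delta({v})) of at least `threshold`."""
    load = {v: 0.0 for v in vertices}
    for (a, b), value in x.items():
        load[a] += value
        load[b] += value
    return {v for v in vertices if load[v] >= threshold - eps}

def is_vertex_cover(U, edges):
    """True if every edge has at least one endpoint in U."""
    return all(a in U or b in U for a, b in edges)

# Assumed LP solution on the path a-b-c-d (value 2 on the middle edge).
vertices = ["a", "b", "c", "d"]
x_star = {("a", "b"): 0.0, ("b", "c"): 2.0, ("c", "d"): 0.0}
U = threshold_vertices(vertices, x_star)
covered = is_vertex_cover(U, list(x_star))
```

Here U = {b, c}, which covers all three edges; the closed walk b-c-b on U is then a tour cover of weight twice the weight of bc.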

We note that there are some trivial cases which our algorithm will not handle. 
However, they can be processed separately, and we briefly mention them here. 
If the input graph is a star, the central node is a solution of weight zero. If 
the input graph is a triangle, doubling the cheapest edge gives us an optimal 
solution. All other cases can be handled by our algorithm. 

2.5 Performance Guarantee 

Theorem 2. Let $x^*$ be the vector minimizing cx over ToC(G) and
$U = \{ v \in V \mid x^*(\delta(\{v\})) \ge 1 \}$. Let F denote the (complete) graph with vertex-set U and
edge-weights c as defined by shortest paths in G. Then

$$\min\{ cy \mid y \in ST(F) \} \le 2 \min\{ cx \mid x \in ToC(G) \}.$$

Proof. Let $y = 2x^*$. Then, y is feasible for

$$\begin{aligned}
A = \{ x \ge 0 \mid\ & x(\delta(\{v\})) \ge 0 \quad \forall v \in V \setminus U \\
& x(\delta(\{u\})) \ge 2 \quad \forall u \in U \\
& x(\delta(S)) \ge 2 \quad \forall S \subset V,\ S \cap U \ne \emptyset,\ U \setminus S \ne \emptyset, \\
& x(\delta(S)) \ge 0 \quad \forall S \subseteq V \setminus U,\ S \ne \emptyset \}.
\end{aligned}$$

Notice that A corresponds to the survivable network polytope (2) with requirement
function

$$r_{uv} = \begin{cases} 2, & u, v \in U \\ 0, & \text{otherwise.} \end{cases}$$

Now let

$$\begin{aligned}
B^{\circ} = \{ x \ge 0 \mid\ & x(\delta(\{v\})) = 0 \quad \forall v \in V \setminus U \\
& x(\delta(\{u\})) = 2 \quad \forall u \in U \\
& x(\delta(S)) \ge 2 \quad \forall S \subset V,\ S \cap U \ne \emptyset,\ U \setminus S \ne \emptyset, \\
& x(\delta(S)) \ge 0 \quad \forall S \subseteq V \setminus U,\ S \ne \emptyset \}.
\end{aligned}$$

By the parsimonious property (Theorem 1),

$$\min\{ cx \mid x \in A \} = \min\{ cx \mid x \in B^{\circ} \}.$$

We define

$$\begin{aligned}
B = \{ x \ge 0 \mid\ & x(\delta(\{v\})) = 0 \quad \forall v \in V \setminus U \\
& x(\delta(\{u\})) = 2 \quad \forall u \in U \\
& x(\delta(S)) \ge 2 \quad \forall S \subset U,\ \emptyset \ne S \ne U \},
\end{aligned}$$

that is, B is the subtour polytope ST(F). We next show that $B = B^{\circ}$, from
which it follows that

$$\min\{ cx \mid x \in B \} = \min\{ cx \mid x \in A \}. \qquad (4)$$

Claim. $B = B^{\circ}$.

Proof. It is clear that $B^{\circ} \subseteq B$. Let $x \in B$. Clearly, for $\emptyset \ne S \subseteq V \setminus U$ we have
$x(\delta(S)) \ge 0$. Now, consider some set S with a requirement of 2. We show that
$x(\delta(S)) = x(\delta(S \cap U))$. The claim then follows from $x \in B$.

In the following we use $\bar{U}$ to denote $V \setminus U$ and $\bar{S}$ to denote $V \setminus S$. We also use U : V to denote
the set of edges with exactly one end point in each of U and V, that is,
$U : V = \{ uv \in E \mid u \in U, v \in V \}$. Notice that we can express the difference
$x(\delta(S)) - x(\delta(S \cap U))$ in the following way:

$$\begin{aligned}
& x(S \cap \bar{U} : \bar{S} \cap U) \ + \qquad & (5) \\
& x(S \cap \bar{U} : \bar{S} \cap \bar{U}) \ - \qquad & (6) \\
& x(S \cap U : S \cap \bar{U}). \qquad & (7)
\end{aligned}$$

Since $x \in B$ we know that $x(\delta(v)) = 0$ for all $v \in \bar{U}$. Hence the terms (5), (6),
and (7) above evaluate to zero. □

The left-hand side of (4) is equal to $\min\{ cx \mid x \in ST(F) \}$. Now, putting
together all of the above, we have

$$\begin{aligned}
\min\{ cx \mid x \in ST(F) \} &= \min\{ cx \mid x \in B \} = \min\{ cx \mid x \in A \} \\
&\le cy = 2cx^* = 2 \min\{ cx \mid x \in ToC(G) \}.
\end{aligned}$$

The first equality here follows from the definition of B. The second equality is
equation (4), and the inequality is true because y is feasible for A. The final two
equalities follow from the definitions of y and $x^*$. □

Wolsey [11] and Shmoys and Williamson [9] prove the following theorem.

Theorem 3. Let G = (V, E) be a graph with edge-weight function c satisfying
the triangle inequality. Then the weight of the traveling salesman tour on G
output by Christofides' algorithm is no more than $\frac{3}{2} \min\{ cx \mid x \in ST(G) \}$.

From Theorems 2 and 3, and the fact that $\min\{ cx \mid x \in ToC(G) \}$ is a lower
bound on the weight of an optimal tour cover, it follows that the approximation
ratio of our algorithm for tour cover can be upper-bounded by $2 \cdot \frac{3}{2} = 3$.

Corollary 1. The algorithm above outputs a tour cover of weight no more than 
3 times the weight of the minimum tour cover. 
3 Tree Cover

3.1 Bidirected Formulation

For tree cover, we follow essentially the same procedure as for tour cover, with
one difference. We use a bidirected formulation for the tree cover. That is, we first
transform the original graph into a directed graph by replacing every undirected
edge uv by a pair of directed edges $(u \to v)$, $(v \to u)$, each having the same
weight as the original undirected edge. We then pick one vertex as the root, and
search for a minimum-weight branching which also covers all the edges of the
graph. We denote this directed graph by $\vec{G} = (V, \vec{E})$.

We do not know which vertex to pick as the root. However, we can simply
repeat the whole algorithm for every possible choice of the root, and pick the
best solution. It is easy to see that such a branching has a direct correspondence
with a tree cover in the original undirected graph, having the same weight.



3.2 Linear Program 

For a fixed root r, define F to be the set of all subsets S of F \ {r} such that S 
induces at least one edge of 1?, 



^ = {5cf\h I 

If $T$ is a set of edges forming a tree cover of $G$ and containing $r$, then let
$\vec{T}$ denote the branching obtained by directing all edges of $T$ towards the root
$r$. Now for every $S \in \mathcal{F}$, $\vec{T}$ must contain at least one edge leaving $S$. We use
$\delta^+(S)$ to denote the set of directed edges leaving the set $S$. Hence we have the
following IP formulation.



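\[
\begin{aligned}
\min\ & \sum_{e \in \vec{E}} c_e x_e \\
\text{s.t.}\ & \sum_{e \in \delta^+(S)} x_e \ge 1 \quad \text{for all } S \in \mathcal{F} \qquad (8) \\
& x \in \{0,1\}^{|\vec{E}|}.
\end{aligned}
\]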

Replacing the integrality constraints by

$x \ge 0$,

we obtain the linear programming relaxation. We use $TrC(\vec{G})$ to denote the
convex hull of all vectors $x$ satisfying the constraints above.

3.3 Quasi-Bipartite Bidirected Steiner Tree Polytope

A graph $G = (V,E)$ on which an instance of the Steiner tree problem is given
by specifying the set $R \subseteq V$ of terminals is called quasi-bipartite if $S = V \setminus R$
induces an independent set. Rajagopalan and Vazirani [10] give a $\frac{3}{2}$-approximation
algorithm for the quasi-bipartite Steiner tree problem using a bidirected cut
relaxation.

For a specific choice of a root vertex $r$, the quasi-bipartite bidirected Steiner
tree polytope $QBST(G[R])$ is defined as

$QBST(\vec{G}) = \{ x \in [0,1]^{|\vec{E}|} \mid x(\delta^+(S)) \ge 1 \ \ \forall S \subseteq V \setminus \{r\},\ S \cap R \ne \emptyset \}$.



3.4 Algorithm 

We are now ready to state our algorithm for tree cover. 

(1) For every vertex $r \in V$, let $x^*$ be the vector minimizing $cx$ over $TrC(\vec{G})$
with $r$ as the root.

(2) Let $U = \{u \in V \mid x^*(\delta^+(\{u\})) \ge \frac{1}{2}\}$.

(3) For any two vertices $u, v \in U$, if $uv \notin E$, let $c_{uv}$ be the weight of the shortest
$u$-$v$ path in $G$.

(4) Run the Rajagopalan-Vazirani algorithm to find an approximate minimum
Steiner tree on $\vec{G}$, with $U$ as the set of terminals, and call this $T_r$.

(5) Pick the cheapest such $T_r$.

Note that we are able to solve the linear program in step (1) in essentially 
the same way as the tour cover LP, appealing to the ellipsoid method and using a 
min-cut computation as a separation oracle. Trivial cases exist for this problem 
too; they can be handled similar to the way we handle the tour cover trivial 
cases. The algorithm initially yields a branching in the bidirected graph. We 
map this in the obvious way to a set of edges in the original undirected graph. 
Some of the edges in this set may be redundant since we were working on the 
metric completion of the directed graph; we prune the solution to get a tree 
without any increase in weight. 

The algorithm outputs a tree which spans U (and possibly other vertices). 
Since $U$ is a vertex cover of $G$, this tree is in fact a tree cover of $G$.
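
Steps (2) and (3) of the algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the LP solution is already available as a dictionary from arcs to values; the function names, the numerical tolerance, and the tiny example solution `x_star` are hypothetical.

```python
from collections import defaultdict


def half_threshold_set(x, vertices):
    """Return U = {v : x(delta^+({v})) >= 1/2}; delta^+({v}) is the set of arcs leaving v."""
    out_value = defaultdict(float)
    for (tail, _head), value in x.items():
        out_value[tail] += value
    # The tolerance guards against LP round-off; its size is an assumption.
    return {v for v in vertices if out_value[v] >= 0.5 - 1e-9}


def is_vertex_cover(U, edges):
    """Check that every edge has at least one endpoint in U."""
    return all(u in U or v in U for u, v in edges)


def metric_completion(vertices, weight):
    """Shortest-path weights between all pairs (Floyd-Warshall) of an undirected graph."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in vertices} for u in vertices}
    for (u, v), w in weight.items():
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical instance: path a - b - c with root c. The only set in F is {a, b},
# and the only arc leaving it is (b, c), so x(b, c) = 1 is feasible.
vertices = ["a", "b", "c"]
weight = {("a", "b"): 1.0, ("b", "c"): 2.0}
x_star = {("b", "c"): 1.0}

U = half_threshold_set(x_star, vertices)
closure = metric_completion(vertices, weight)
```

In this example the threshold set is a vertex cover of the path, as the argument above requires.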



3.5 Performance Guarantee 

Theorem 4. Let $x^*$ be the vector minimizing $cx$ over $TrC(\vec{G})$ and
$U = \{v \in V \mid x^*(\delta^+(\{v\})) \ge \frac{1}{2}\}$. Then

$\min\{cy \mid y \in QBST(G[U])\} \le 2 \min\{cx \mid x \in TrC(\vec{G})\}$.

Proof. Consider an edge $e = uv \in \vec{E}$. Since $x^* \in TrC(\vec{G})$, we have that
$x^*(\delta^+(\{u,v\})) \ge 1$. Every edge leaving $\{u,v\}$ leaves $u$ or $v$, hence either
$x^*(\delta^+(\{u\})) \ge \frac{1}{2}$ or $x^*(\delta^+(\{v\})) \ge \frac{1}{2}$, and $U$
is a vertex cover of $G$. Note that $V \setminus U$ is an independent set because for all
$u,v \in V \setminus U$, we have $x^*(\delta^+(\{u\})) < \frac{1}{2}$ and $x^*(\delta^+(\{v\})) < \frac{1}{2}$, so that $uv \notin E$.

Now consider the vector $y = 2x^*$. Clearly $cy = 2cx^*$. Also clearly $y \in
QBST(G[U])$. Hence if $y^*$ is the minimizer of $\{cy \mid y \in QBST(G[U])\}$, then
$cy^* \le cy = 2cx^*$. □

Rajagopalan and Vazirani[10] prove the following. 

Theorem 5. Let $G = (V,E)$ be a graph with edge-weight function $c$ satisfying
the triangle inequality. Let $V = R \cup S$ be a partition of the vertex set such
that $G$ has no edges both of whose end points are in $S$. Then we can find in
polynomial time a Steiner tree spanning $R$ of weight no more than
$\frac{3}{2} \min\{cx \mid x \in QBST(\vec{G})\}$.

From Theorems 4 and 5 it follows that the approximation ratio of our algo- 
rithm for tree cover can be upper-bounded by 3. 

Corollary 2. The algorithm above outputs a tree cover of weight no more than
3 times the weight of the minimum tree cover.

4 Conclusion 

4.1 Gap Examples: Linear Program, Algorithm 

We do not have examples where the worst-case performance of our algorithm is 
actually achieved. However, we do have examples where the ratio of our solution 
to the LP solution is equal to the performance guarantee. 

For the tour cover problem, consider the unit complete graph. It is easy to 
see that an optimal LP solution is obtained by setting Xg = for each edge in 
the graph. This solution has value Our algorithm will round this to 

a tree, which could yield a star having $n-1$ edges and all nodes of odd degree.
The second stage will then yield a tour having roughly $\frac{3}{2}(n-1)$ edges, which is
of weight 3 times the LP solution.

We are not aware of any graph for which the Rajagopalan-Vazirani algorithm
achieves its worst-case bound of $\frac{3}{2}$. Hence for the tree cover, we do not have an
example where the ratio of our solution to even the LP optimum is 3. However,
for the complete unit graph, it is easy to see that the integrality gap is at least 2.



4.2 Further Open Questions 

Obtaining approximation algorithms with better approximation guarantees is an 
obvious open question. We note that we do not have examples where either
algorithm actually achieves its worst-case performance bound, so it may be possible
to improve the performance guarantees of our algorithms with tighter analyses.
The directed version of both problems remains wide open.

We also note that we use a two stage procedure to solve these problems. A 
single procedure which directly puts us in the desired polytopes might yield a 
better approximation ratio. 

Abstract. We generalize and unify techniques from several papers to
obtain a relatively simple and general technique for designing approximation
algorithms for finding min-cost $k$-node connected spanning subgraphs.
For the general instance of the problem, the previously best
known algorithm has approximation ratio $2k$. For $k \le 5$, algorithms with
approximation ratio $\lceil (k+1)/2 \rceil$ are known. For metric costs Khuller and
Raghavachari gave a $(2 + \frac{2(k-1)}{n})$-approximation algorithm. We obtain
the following results.

(i) An $I(k - k_0)$-approximation algorithm for the problem of making
a $k_0$-connected graph $k$-connected by adding a minimum cost
edge set, where $I(k) = 2 + \sum_{j=1}^{\lfloor k/2 \rfloor - 1} \frac{1}{j} \lfloor \frac{k}{j+1} \rfloor$.

(ii) A $(2 + \frac{k-1}{n})$-approximation algorithm for metric costs.

(iii) A $\lceil (k+1)/2 \rceil$-approximation algorithm for $k = 6, 7$.

(iv) A fast $\lceil (k+1)/2 \rceil$-approximation algorithm for $k = 4$.

The multiroot problem generalizes the min-cost $k$-connected subgraph
problem. In the multiroot problem, requirements $k_u$ for every node $u$ are
given, and the aim is to find a minimum-cost subgraph that contains
$\max\{k_u, k_v\}$ internally disjoint paths between every pair of nodes $u,v$.

For the general instance of the problem, the best known algorithm has
approximation ratio $2k$, where $k = \max k_u$. For metric costs there is a
3-approximation algorithm. We consider the case of metric costs, and,
using our techniques, improve for $k \le 7$ the approximation guarantee
from 3 to $2 + \frac{\lfloor (k-1)/2 \rfloor}{k} < 2.5$.

1 Introduction 

A basic problem in network design is, given a graph $\mathcal{G}$, to find its minimum cost
subgraph that satisfies given connectivity requirements (see [10,6] for surveys). A
fundamental problem in this area is the survivable network design problem: find
a cheapest spanning subgraph such that for every pair of nodes $\{u, v\}$, there are
at least $k_{uv}$ internally disjoint paths between $u$ and $v$, where $k_{uv}$ is a nonnegative
integer (requirement) associated with the pair $\{u, v\}$. No efficient approximation
algorithm for this problem is known.

A $\rho$-approximation algorithm for a minimization problem is a polynomial
time algorithm that produces a solution of value no more than $\rho$ times the value
of an optimal solution; $\rho$ is called the approximation ratio of the algorithm.

A particularly important case of the survivable network design problem is the
problem of finding a cheapest $k$-connected spanning subgraph, that is, the case
when $k_{uv} = k$ for every node pair $\{u,v\}$. This problem is NP-hard for $k = 2$.
Ravi and Williamson [13] presented a $2H(k)$-approximation algorithm, where
$H(k) = 1 + \frac{1}{2} + \cdots + \frac{1}{k}$. However, the proof of the approximation ratio in
[13] contains an error. The algorithm of [13] has $k$ iterations; at iteration $i$ the
algorithm finds an edge set $F_i$ such that $G_i = (V, F_1 \cup \cdots \cup F_i)$ is $i$-connected.
At the end, $G_k$ is output. There is an example [14] showing that the edge set $F_k$
found at the last iteration has cost at least $k/2$ times the value of an optimal
solution. On the other hand, it is easy to get a $2k$-approximation algorithm, see
for example [3]. $\lceil (k+1)/2 \rceil$-approximation algorithms are known for $k \le 5$;
see [11] for $k = 2$, [2] for $k = 2,3$, and [5] for $k = 4,5$.

For metric costs and $k$ arbitrary, Khuller and Raghavachari [11] gave a
$(2 + \frac{2(k-1)}{n})$-approximation algorithm (see also a 3-approximation algorithm in [3]).

We extend and generalize some of these algorithms, and unify ideas from [11],
[2,5], [3], and [9] to show further improvements. Among our results are: (i) An
$I(k - k_0)$-approximation algorithm for the problem of making a $k_0$-connected
graph $k$-connected by adding a minimum cost set of edges, where
$I(k) = 2 + \sum_{j=1}^{\lfloor k/2 \rfloor - 1} \frac{1}{j} \lfloor \frac{k}{j+1} \rfloor$
(note that $I(k) < k$ for $k \ge 7$); (ii) A $(2 + \frac{k-1}{n})$-approximation
algorithm for metric costs; (iii) An algorithm for $k = 6, 7$ with approximation
ratio $\lceil (k+1)/2 \rceil = 4$; (iv) A fast $\lceil (k+1)/2 \rceil$-approximation algorithm for $k = 4$.

Particular cases of the survivable network design problem, where pairwise
node requirements are defined by single node requirements, arise naturally in
network design. In the multiroot problem, requirements $k_u$ for every node $u$ are
given, and the aim is to find a minimum-cost subgraph that contains $\max\{k_u, k_v\}$
internally disjoint paths between every pair of nodes $u, v$. A graph is said to be
$k$-outconnected from a node $r$ if it contains $k$ internally disjoint paths between $r$
and any other node; such a node $r$ is usually referred to as the root. It is easy to see
that a subgraph is a feasible solution to the multiroot problem if and only if it is
$k_u$-outconnected from every node $u$. Given an instance of the multiroot problem,
we use $q$ to denote the number of nodes $u$ with $k_u > 0$, and $k = \max k_u$ is the
maximum requirement. Note that the min-cost $k$-connected subgraph problem
is a special case of the multiroot problem when $k_u = k$ for every node $u$.

The one root problem (i.e., when $q = 1$) was considered long ago. We now
describe a 2-approximation algorithm for the one root problem. Let us say that
a directed graph $D$ is $k$-outconnected from $r$ if in $D$ there are $k$ internally
disjoint paths from $r$ to any other node. For directed graphs, Frank and Tardos [7]
showed that the problem of finding an optimal $k$-outconnected from $r$ subdigraph
is solvable in polynomial time; a faster algorithm is due to Gabow [8]. This
implies a 2-approximation algorithm for the (undirected) one root problem, as
follows. First, replace every undirected edge of $\mathcal{G}$ by the two antiparallel directed
edges with the same ends and of the same cost. Then compute an optimal
$k$-outconnected from $r$ subdigraph and output its underlying (undirected) simple
graph. It is easy to see that the output subgraph is $k$-outconnected from $r$, and
has cost at most twice the value of an optimal $k$-outconnected from $r$ subgraph;
see [11]. The algorithm can be implemented to run in $O(k^2 n^2 m)$ time using the
algorithm of [8].

For the multiroot problem, a $2q$-approximation algorithm follows by applying
the above algorithm for each root and taking the union of the resulting $q$
subgraphs. The approximation guarantee $2q$ of this algorithm is tight for $q \le k$, see
[3]. For metric costs and $k$ arbitrary, Cheriyan et al. [3] gave a 3-approximation
algorithm. For metric costs and $k = 2$, it can be shown that the problem is
equivalent to that of finding a 2-connected subgraph (for the latter, there is a
3/2-approximation algorithm). We consider the case of metric costs, and improve
for $3 \le k \le 7$ the approximation ratio from 3 to $2 + \frac{\lfloor (k-1)/2 \rfloor}{k} < 2.5$.

This paper is organized as follows. Sect. 2 contains preliminary results and
definitions. Sect. 3 gives applications of our techniques. For the min-cost
$k$-connected subgraph problem: an algorithm for arbitrary costs (Sect. 3.1); a
$(2 + \frac{k-1}{n})$-approximation algorithm for metric costs (Sect. 3.2); a 4-approximation
algorithm for $k \in \{6,7\}$ (Sect. 3.3); a fast 3-approximation algorithm for $k = 4$
(Sect. 3.4). For the metric multiroot problem: a 2.5-approximation algorithm for
$k \le 7$ (Sect. 3.5).



2 Definitions and Preliminary Results 

All the graphs in the paper are assumed to be simple (i.e., without loops and
parallel edges). An edge with endnodes $u,v$ is denoted by $uv$. For an arbitrary
graph $G$, $V(G)$ denotes the node set of $G$, and $E(G)$ denotes the edge set of
$G$. Let $G = (V,E)$ be a graph. For any set of edges and nodes $U = E' \cup V'$,
we denote by $G \setminus U$ (resp., $G \cup U$) the graph obtained from $G$ by deleting $U$
(resp., adding $U$), where deletion of a node implies also deletion of all the edges
incident to it. For a nonnegative cost function $c$ on the edges of $G$ and a subgraph
$G' = (V', E')$ of $G$ we use the notation $c(G') = c(E') = \sum\{c(e) : e \in E'\}$.

Let $G = (V,E)$ be a graph, and let $X \subseteq V$. We denote by $\Gamma_G(X)$ or simply
by $\Gamma(X)$ the set $\{u \in V \setminus X : xu \in E \text{ for some } x \in X\}$ of neighbors of $X$, and by
$\gamma(X) = \gamma_G(X) = |\Gamma_G(X)|$ its cardinality. Let $X^* = V \setminus (X \cup \Gamma(X))$ denote the
"node complement" of $X$. Note that $\Gamma(V) = \Gamma(\emptyset) = \emptyset$. Also, for any $X,Y \subseteq V$
we have $\Gamma(X^*) \subseteq \Gamma(X)$ and $(X \cup Y)^* = X^* \cap Y^*$; thus $\Gamma(X^* \cap Y^*) \subseteq \Gamma(X \cup Y)$.
The following proposition is known (e.g., see [9, Lemma 1.2]).

Proposition 1. In a graph $G = (V,E)$, for any $X,Y \subseteq V$ holds:

$\gamma(X) + \gamma(Y) \ge \gamma(X \cap Y) + \gamma(X \cup Y)$ (1)
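
The neighbor set, the node complement, and inequality (1) can be checked by brute force on a small graph. The sketch below is illustrative only: the adjacency-dictionary representation and the function names are assumptions, and the exhaustive check is exponential in the number of nodes.

```python
from itertools import combinations


def neighbors(adj, X):
    """Gamma(X): nodes outside X that are adjacent to some node of X."""
    X = set(X)
    return {u for x in X for u in adj[x]} - X


def gamma(adj, X):
    """gamma(X) = |Gamma(X)|."""
    return len(neighbors(adj, X))


def node_complement(adj, X):
    """X* = V minus (X union Gamma(X))."""
    X = set(X)
    return set(adj) - X - neighbors(adj, X)


def check_proposition_1(adj):
    """Verify gamma(X) + gamma(Y) >= gamma(X & Y) + gamma(X | Y) for all X, Y."""
    nodes = list(adj)
    subsets = [set(c) for r in range(len(nodes) + 1) for c in combinations(nodes, r)]
    return all(
        gamma(adj, X) + gamma(adj, Y) >= gamma(adj, X & Y) + gamma(adj, X | Y)
        for X in subsets
        for Y in subsets
    )


# Hypothetical example: a cycle on five nodes.
cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
holds = check_proposition_1(cycle)
```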

Let $V$ be an arbitrary groundset. Two sets $X,Y \subset V$ cross (or $X$ crosses $Y$)
if $X \cap Y \ne \emptyset$ and neither $X \subseteq Y$ nor $Y \subseteq X$. Let $\mathcal{C}$ be a collection of proper
subsets of $V$. Let $\nu(\mathcal{C})$ be the maximum number of pairwise disjoint sets from $\mathcal{C}$.
We say that $U \subseteq V$ covers $\mathcal{C}$, or that $U$ is a $\mathcal{C}$-cover, if $X \cap U \ne \emptyset$ for every $X \in \mathcal{C}$.
Let $\tau(\mathcal{C})$ be the minimum cardinality of a $\mathcal{C}$-cover. Clearly, $\tau(\mathcal{C}) \ge \nu(\mathcal{C})$.
We say that $X \in \mathcal{C}$ is $\mathcal{C}$-minimal if $X$ does not contain any other set from $\mathcal{C}$.
Clearly, if $U$ is a cover of all the $\mathcal{C}$-minimal sets, then $U$ also covers $\mathcal{C}$. Note that
if every set in $\mathcal{C}$ is $\mathcal{C}$-minimal, then any two sets in $\mathcal{C}$ are either disjoint or cross.
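
The notions of crossing sets, the packing number, and the covering number can be illustrated by brute force on a toy collection. The function names `cross`, `nu`, and `tau` and the example collection are assumptions made for this sketch; the enumeration is exponential and meant only for tiny inputs.

```python
from itertools import combinations


def cross(X, Y):
    """X and Y cross if they intersect and neither contains the other."""
    X, Y = set(X), set(Y)
    return bool(X & Y) and not X <= Y and not Y <= X


def nu(collection):
    """Maximum number of pairwise disjoint sets in the collection."""
    sets = [frozenset(s) for s in collection]
    for r in range(len(sets), 0, -1):
        for family in combinations(sets, r):
            if all(not (a & b) for a, b in combinations(family, 2)):
                return r
    return 0


def tau(collection, ground):
    """Minimum cardinality of a subset of the ground set meeting every set."""
    sets = [frozenset(s) for s in collection]
    ground = list(ground)
    for r in range(len(ground) + 1):
        for U in combinations(ground, r):
            if all(set(U) & s for s in sets):
                return r
    return None  # some set is empty or lies outside the ground set


# Hypothetical collection of proper subsets of {1, ..., 5}.
V = {1, 2, 3, 4, 5}
C = [{1, 2}, {2, 3}, {1, 3}, {4, 5}]
packing, covering = nu(C), tau(C, V)
```

Here two disjoint sets can be packed, while three nodes are needed to cover the collection, so the inequality between the two quantities can be strict.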

We say that $X \subseteq V$ is $l$-tight if $\gamma(X) = l$ and $X^* \ne \emptyset$ (i.e., if $\gamma(X) = l$ and
$|X| \le |V| - l - 1$). A graph $G$ is $k$-(node)-connected if it contains $k$ internally
disjoint paths between any pair of its nodes. By Menger's Theorem, $G$ is
$k$-connected if and only if $|V(G)| \ge k + 1$ and there are no $l$-tight sets with
$l \le k - 1$ in $G$. Let $\mathcal{C}_l(G)$ denote the collection of all the inclusion minimal
sets from $\{X \subseteq V : X \text{ is } l'\text{-tight},\ l' \le l\}$. Note that every set in $\mathcal{C}_l(G)$ is
$\mathcal{C}_l(G)$-minimal; thus any two sets in $\mathcal{C}_l(G)$ either cross or are disjoint. For brevity, let
$\nu_l(G) = \nu(\mathcal{C}_l(G))$, $\tau_l(G) = \tau(\mathcal{C}_l(G))$, and say that $U \subseteq V$ is an $l$-cover if $U$ covers $\mathcal{C}_l(G)$.
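
The characterization of connectivity through tight sets can be turned into a brute-force check on small graphs. The sketch below is an illustration under stated assumptions: graphs are adjacency dictionaries, only nonempty sets are considered as tight sets, and the function names are hypothetical. It enumerates all subsets, so it is not a practical connectivity algorithm.

```python
from itertools import combinations


def neighbors(adj, X):
    """Gamma(X): nodes outside X adjacent to some node of X."""
    X = set(X)
    return {u for x in X for u in adj[x]} - X


def tight_sets(adj):
    """Yield (X, l) for every nonempty X with nonempty node complement; X is then l-tight."""
    nodes = list(adj)
    for r in range(1, len(nodes)):
        for X in combinations(nodes, r):
            boundary = neighbors(adj, X)
            if len(X) + len(boundary) < len(nodes):  # X* is nonempty
                yield set(X), len(boundary)


def connectivity(adj):
    """Largest k with |V| >= k + 1 and no l-tight set for l <= k - 1."""
    sizes = [l for _, l in tight_sets(adj)]
    # Without tight sets the graph is complete, hence (|V| - 1)-connected.
    return min(sizes) if sizes else len(adj) - 1


# Hypothetical examples.
cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```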

An edge e of a graph G is said to be critical w.r.t. property P if G satisfies 
property P, but G \ e does not. The following theorem is due to Mader. 

Theorem 1 (Mader, [15]). In a $k$-connected graph, any cycle in which every
edge is critical w.r.t. $k$-connectivity contains a node of degree $k$.

As was pointed out in [9], this implies that if $\gamma(v) \ge k - 1$ for every $v \in V(G)$,
and if $F$ is an inclusion minimal edge set such that $G \cup F$ is $k$-connected, then $F$
is a forest. Note that if $U$ is an $l$-cover, then for any $X \in \mathcal{C}_l(G)$ holds: $X \cap U \ne \emptyset$
and $X^* \cap U \ne \emptyset$. Thus we have:

Corollary 1. Let $R$ be a $(k-1)$-cover in a graph $G$, and let $E' = \{uv : u \ne
v \in R\}$. Then $G \cup E'$ is $k$-connected. Moreover, if $\gamma(v) \ge k - 1$ for every $v \in V$,
and if $F \subseteq E'$ is an inclusion minimal edge set such that $G \cup F$ is $k$-connected,
then $|F| \le |R| - 1$.

The following property of $k$-outconnected graphs can be easily deduced from
[2, Lemma 3.1].

Lemma 1 ([2], Lemma 3.1). Let $G$ be $k$-outconnected from $r$, let $H = G \setminus r$,
and let $S \subset V$ be an $l$-tight set in $H$, $l \le k - 1$. Then $r \in \Gamma_G(S)$ (so $S$ is
$(l+1)$-tight in $G$), and

(i) $|S \cap \Gamma_G(r)| \ge k - l$; thus $\Gamma_G(r)$ is a $(k-1)$-cover in $H$, and $\Gamma_G(r) \setminus v$ is a
$(k-1)$-cover in $G$ for any $v \in \Gamma_G(r)$.

(ii) $l \ge k - \lfloor \gamma_G(r)/2 \rfloor$; thus $H$ is $(k - \lfloor \gamma_G(r)/2 \rfloor)$-connected, and $G$ is
$(k - \lfloor \gamma_G(r)/2 \rfloor + 1)$-connected.

Throughout the paper, for an instance of a problem, we will denote by $\mathcal{G}$ the
input graph, and by $opt$ the value of an optimal solution; $n = |V(\mathcal{G})|$ denotes the
number of nodes in $\mathcal{G}$, and $m = |E(\mathcal{G})|$ the number of edges in $\mathcal{G}$. We assume that
$\mathcal{G}$ contains a feasible solution; otherwise our algorithms can be easily modified
to output an error.

For the min-cost $k$-connected subgraph problem, we can assume that $\mathcal{G}$ is
a complete graph, and that $c(e) \le opt$ for every edge $e$ of $\mathcal{G}$. Indeed, let $G =
(V,E)$ be a $k$-connected graph, and let $st \in E$. Let $F_{st}$ be the edge set of a
cheapest set of $k$ internally disjoint paths between $s$ and $t$ in $\mathcal{G}$. Then $(G \setminus st) \cup F_{st}$
is $k$-connected and, clearly, $c(F_{st}) \le opt$. Note that $F_{st}$ as above can be found
in $O(n \log n (m + n \log n))$ time by a min-cost $k$-flow algorithm of [12] (the node
version), and flow decomposition. We also use the following lemma (for a proof
see the full version).

Lemma 2. Let $G$ be a subgraph of $\mathcal{G}$ containing $l$ internally disjoint paths
between two nodes $s,t \in V(G)$. For $p \ge l + 1$ let $F^p$ be an optimal edge set such
that $G \cup F^p$ contains $p$ internally disjoint paths between $s$ and $t$. Then for any
$k \ge l + 1$ holds: $c(F^{l+1}) \le \frac{1}{k-l} c(F^k)$.

The main idea of most of our algorithms is to find a certain subgraph of
$\mathcal{G}$ of low cost and with a small cardinality $(k-1)$-cover or augmenting edge
set. Such a subgraph is found by using the following two modifications of the
2-approximation algorithm for the one root problem. Each one of these
modifications outputs a subgraph of $\mathcal{G}$ of cost $\le 2opt$ (here $opt$ is the cost of an optimal
$k$-connected subgraph of $\mathcal{G}$) and a $(k-1)$-cover $R$ of the subgraph.

The first modification is from [11], and we use it for the metric case. Let $\mathcal{G}_r$
be a graph constructed from $\mathcal{G}$ by adding an external node $r$ and connecting it
by edges of cost 0 to an arbitrary set $R$ of at least $k$ nodes in $\mathcal{G}$. We compute a
$k$-outconnected from $r$ subgraph $G$ of $\mathcal{G}_r$ using the 2-approximation algorithm
for the root $r$, and output $H = G \setminus r$. As was shown in [11], $c(H) \le 2opt$. By
Lemma 1 (i), $R$ is a $(k-1)$-cover of $H$. We shall refer to this modification as the
External Outconnected Subgraph Algorithm (EOCSA). It can be implemented in
$O(k^2 n^2 m)$ time using the algorithm of [8].

The second modification is from [2,5]. It finds a subgraph $G$ and a node
$r$ such that: $G$ is $k$-outconnected from $r$, $\gamma_G(r) = k$, and $c(G) \le 2opt$. The
time complexity of the algorithm is $O(k^2 n^3 m)$ for the deterministic version,
and $O(k^2 n^2 m \log n)$ for the randomized one. By Lemma 1 (i), $R = \Gamma_G(r) \setminus v$ is a
$(k-1)$-cover in $G$ for any $v \in \Gamma_G(r)$. We shall refer to the deterministic version as
the Outconnected Subgraph Algorithm (OCSA), and to the randomized version
as the Randomized Outconnected Subgraph Algorithm (ROCSA).

3 Applications 

3.1 Min-Cost k-Connected Subgraphs

It is not hard to get a $k$-approximation algorithm for the min-cost $k$-connected
subgraph problem as follows. We execute OCSA (or ROCSA) to compute a
corresponding subgraph $G$ of $\mathcal{G}$. Let $v \in \Gamma_G(r)$ be arbitrary, and let $R = \Gamma_G(r) \setminus v$.
Recall that, by Lemma 1 (i), $R$ is a $(k-1)$-cover in $G$. We then find an edge
set $F$ as in Corollary 1, so $G \cup F$ is $k$-connected and $F$ is a forest on $R$. Finally,
we replace every edge $st \in F$ by a cheapest set $F_{st}$ of $k$ internally disjoint paths
between $s$ and $t$ in $\mathcal{G}$. By [2], $c(G) \le 2opt$. Since $|F| \le k - 2$, the cost of the
output subgraph is at most $2opt + (k-2)opt = k \cdot opt$.

We can get a slightly better approximation ratio by executing OCSA and
then iteratively increasing the connectivity by 1 until it reaches $k$. The proof of
the approximation ratio is based on Lemma 4 to follow, which implies that an
$l$-connected graph $G$ can be made $(l+1)$-connected by adding at most $\hat{\nu}_l(G)$
edges.

Let $G = (V,E)$ be an $l$-connected graph, and let $X \subset V$ be an $l$-tight set in
$G$. We say that $X$ is small if $|X| \le \frac{n-l}{2}$; otherwise $X$ is large. Clearly, if $X$ is
large, then $X^*$ is small, and $X$ and $X^*$ are both small if and only if $|X| = |X^*| = \frac{n-l}{2}$.
Note that $G$ is $(l+1)$-connected if and only if it has no small $l$-tight sets. The
following lemma is from [13] (for a short proof see the full version).

Lemma 3 ([13], Lemma 3.4). Let $X,Y$ be two intersecting small $l$-tight sets
in an $l$-connected graph $G$. Then

(i) $X \cap Y$ is a small $l$-tight set;

(ii) $X \cup Y$, $(X \cup Y)^*$ are both $l$-tight, and at least one of them is small.

As a consequence, in an $l$-connected graph $G$, no small $l$-tight set crosses a
minimal small $l$-tight set. Thus any two distinct minimal small $l$-tight sets are
disjoint. Let $\hat{\nu}_l(G)$ denote the number of minimal small $l$-tight sets in $G$. Note
that $\hat{\nu}_l(G) \le \nu_l(G)$, and that $G$ is $(l+1)$-connected if and only if $\hat{\nu}_l(G) = 0$. Let
us call an edge $e$ weakly saturating for $G$ if $\hat{\nu}_l(G \cup e) \le \hat{\nu}_l(G) - 1$.

Lemma 4. Let $R$ be a cover of all small $l$-tight sets of an $l$-connected graph $G$.
If $R$ is not an $l$-cover, then there is a weakly saturating edge for $G$.

Proof. Let $R$ be a cover of all small $l$-tight sets of $G$. If $R$ is not an $l$-cover, then
there is $T \in \mathcal{C}_l(G)$ such that $T$ is large, and $T \cap R = \emptyset$. Clearly, $T^*$ is small.
Let $S \subseteq T^*$ be an arbitrary minimal $l$-tight set. Clearly, $S$ is small. Consider the
collection $\mathcal{D}$ of all (inclusion) maximal small $l$-tight sets containing $S$. Let $D$ be
the union of the sets in $\mathcal{D}$. Note that $T^* \in \mathcal{D}$. By Lemma 3 (ii), exactly one of
the following holds: (i) $|\mathcal{D}| = 1$ (so $D$ is $l$-tight and small), or (ii) $|\mathcal{D}| \ge 2$, and
the union of any two sets from $\mathcal{D}$ is a large $l$-tight set.

If case (i) holds, then any edge $e = st$ where $s \in S$ and $t \in T$ is weakly
saturating for $G$, since in $G \cup st$ there cannot be a small $l$-tight set containing
$S$. Assume therefore that case (ii) holds. Let $L$ be a set in $\mathcal{D}$ crossing with $T^*$.
Then, by Lemma 3 (ii), $L^* \subset T$ is tight and small, which is a contradiction.

An immediate consequence of Lemma 4 is that any $l$-connected graph can
be made $(l+1)$-connected by adding at most $\hat{\nu}_l(G)$ edges. This is done as follows.
If $G$ has no weakly saturating edge, we find a set $R$ of size $\hat{\nu}_l(G)$ by picking
a node from every minimal small $l$-tight set. By Lemma 4, $R$ is an $l$-cover, and,
by Corollary 1, we can find a forest $F$ on $R$ such that $G \cup F$ is $(l+1)$-connected.
Else, we find and add a weakly saturating edge, and recursively apply the same
process on the resulting graph. Using appropriate data structures, this can be
implemented in 0{ln^m) time. 

Theorem 2. For the problem of making a $(k-1)$-connected graph $G$ $k$-connected
by adding a minimum cost set of edges there exists a $(2 + \lfloor k/2 \rfloor)$-approximation
algorithm.

Proof. At the first phase we reset the costs of the edges of $G$ to zero, and
execute OCSA: let $G'$ be the output graph, $r$ the corresponding root, and $R =
\Gamma_{G'}(r)$. Now, consider the graph $J = G' \cup G$, and let $l = k - 1$. Note that
$\hat{\nu}_l(J) \le \lfloor k/2 \rfloor$, since (by Lemma 1 (i)) every $l$-tight set in $G'$, and thus in $J$,
contains at least two nodes from $R$, and $|R| = k$. At the second phase we make
$J$ $k$-connected by adding an edge set $F$ as in Lemma 4, with $l = k - 1$. Now,
$c(J) + c(F) \le 2opt + \lfloor k/2 \rfloor opt$.

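For the min-cost $k$-connected subgraph problem one can get an approximation
ratio slightly better than $k$ by sequentially applying augmentation steps as
above. That is, we execute OCSA, and for $l = \lceil k/2 \rceil + 1$ to $k - 1$ increase
the connectivity by 1. At every iteration, $\hat{\nu}_l(G) \le \nu_l(G) \le \lfloor \frac{k}{k-l+1} \rfloor$, where $G$
denotes the current graph. By Lemma 4, $G$ can be made $(l+1)$-connected by
adding $\nu_l(G)$ edges. By Lemma 2, increasing the number of internally disjoint
paths between $s$ and $t$ from $l$ to $l+1$ costs at most $\frac{1}{k-l} opt$. Thus the approximation
ratio of this algorithm is:

\[
I(k) = 2 + \sum_{l=\lceil k/2 \rceil + 1}^{k-1} \frac{1}{k-l} \left\lfloor \frac{k}{k-l+1} \right\rfloor
     = 2 + \sum_{j=1}^{\lfloor k/2 \rfloor - 1} \frac{1}{j} \left\lfloor \frac{k}{j+1} \right\rfloor .
\]

It is easy to check that $I(k) < k$ for $k \ge 7$, but $\lim_{k \to \infty} I(k)/k = 1$. In fact, the
same analysis implies: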

Theorem 3. For the problem of increasing the connectivity from $k_0$ to $k$ by
adding a minimum cost set of edges there exists an $I(k - k_0)$-approximation
algorithm.
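
The behaviour of the ratio $I(k)$ can be explored numerically. The sketch below assumes that $I(k) = 2 + \sum_{j=1}^{\lfloor k/2 \rfloor - 1} \frac{1}{j} \lfloor \frac{k}{j+1} \rfloor$, which is the form suggested by the analysis above (with $j = k - l$, at most $\lfloor k/(j+1) \rfloor$ edges are added, each costing at most $opt/j$); this closed form is an assumption of the sketch. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def approximation_ratio(k):
    """I(k) under the assumed closed form 2 + sum_{j=1}^{floor(k/2)-1} floor(k/(j+1)) / j."""
    return 2 + sum(Fraction(k // (j + 1), j) for j in range(1, k // 2))


# The ratio stays below k from k = 7 on, while I(k) / k approaches 1.
below_k = all(approximation_ratio(k) < k for k in range(7, 200))
ratio_7 = approximation_ratio(7)
ratio_1000 = approximation_ratio(1000) / 1000
```

Under this assumed form, the value for $k = 7$ is 6, and for $k = 6$ the ratio equals $k$ exactly, which matches the remark that the improvement over $k$ starts at $k = 7$.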

3.2 Metric k-Connected Subgraph Problem

In this section we consider the metric min-cost $k$-connected subgraph problem.
We present a modification of the $(2 + \frac{2(k-1)}{n})$-approximation algorithm of Khuller
and Raghavachari [11] to achieve a slightly better approximation guarantee
$(2 + \frac{k-1}{n})$.

Here is a short description of the algorithm of [11]. For $l \ge 3$, an $l$-star is a
tree with $l$ nodes and $l - 1$ leaves; the non-leaf node is referred to as the center of
the star. Note that a min-cost subgraph of $\mathcal{G}$ which is an $l$-star with center $v$ can
be computed in $O(ln)$ time, and the overall cheapest $l$-star in $O(ln^2)$ time. The
algorithm of [11] finds the node set $R$ of a cheapest $k$-star, executes EOCSA,
and adds to the graph $H$ calculated the edge set $E'$ as in Corollary 1 (that is, all
the edges with both endnodes in $R$ that are not in $H$). In [11] it is shown that
$c(E') \le \frac{2(k-1)}{n} opt$.

In our algorithm, we make a slightly different choice of $R$, and add an extra
phase of removing from $E'$ the noncritical edges (that is, we add an edge set $F$
as in Corollary 1). We show that for our choice of $R$, $c(F) \le \frac{k-1}{n} opt$. We use the
following simple lemma (for a proof see the full version):

Lemma 5. Let $G = (R,E)$ be a complete graph with nonnegative weights $w(v)$,
$v \in R$, on the nodes. If $F$ is a forest on $R$ then

$\sum\{w(u) + w(v) : uv \in F\} \le (|R| - 2) \max\{w(v) : v \in R\} + \sum\{w(v) : v \in R\}$.
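
The inequality of Lemma 5 can be sanity-checked numerically. The sketch below is not a proof; it simply evaluates both sides on forests. The helper names and the random forest generator are assumptions made for this illustration, and integer weights are used so that the comparison is exact.

```python
import random


def lemma5_sides(w, forest):
    """Return (lhs, rhs) of the forest inequality for node weights w and a forest on their nodes."""
    lhs = sum(w[u] + w[v] for u, v in forest)
    rhs = (len(w) - 2) * max(w.values()) + sum(w.values())
    return lhs, rhs


def random_forest(nodes, rng):
    """Each node links to at most one earlier node, so the edge set is acyclic."""
    edges = []
    for i in range(1, len(nodes)):
        if rng.random() < 0.8:
            edges.append((nodes[rng.randrange(i)], nodes[i]))
    return edges


# Hypothetical tight example: a path 0 - 1 - 2 whose light node is an endpoint.
weights = {0: 0, 1: 5, 2: 5}
path = [(0, 1), (1, 2)]
lhs, rhs = lemma5_sides(weights, path)  # both sides equal 15 here
```

The tight example mirrors the use of the lemma in the analysis, where the center of the star has weight 0.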

In our algorithm, we start by choosing the cheapest $(k+1)$-star $J_{k+1}$. Let $v_0$
be its center, and let its leaves be $v_1, \ldots, v_k$. Denote $w_0 = 0$ and $w_i = c(v_0 v_i)$,
$i = 1, \ldots, k$. W.l.o.g. assume that $w_1 \le w_2 \le \cdots \le w_k$. Note that
$c(v_i v_j) \le w_i + w_j$, $0 \le i \ne j \le k$. Let us delete $v_k$ from the star. This results in a $k$-star
$J_k$, and let $R$ be its node set. For such $R$, let $H$ be the subgraph of $\mathcal{G}$ calculated
by EOCSA, and let $F$ be an edge set as in Corollary 1, so $H \cup F$ is $k$-connected,
and $F$ is a forest. The algorithm will output $H \cup F$. All this can be implemented
in $O(k^2 n^2 m)$ time.
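
The choice of $R$ described above can be sketched as follows: find the cheapest $(k+1)$-star by taking, for every candidate center, its $k$ cheapest incident edges, and then drop the most expensive leaf. The nested-dictionary cost representation and the function names are assumptions of this sketch; EOCSA and the computation of $F$ are not shown.

```python
import heapq


def cheapest_star(cost, nodes, l):
    """Cheapest l-star (a center plus l - 1 leaves) in a complete graph with costs cost[u][v]."""
    best = None
    for center in nodes:
        others = (u for u in nodes if u != center)
        # The l - 1 cheapest edges at the center, in nondecreasing order of cost.
        leaves = heapq.nsmallest(l - 1, others, key=lambda u: cost[center][u])
        total = sum(cost[center][u] for u in leaves)
        if best is None or total < best[0]:
            best = (total, center, leaves)
    return best


def choose_cover_nodes(cost, nodes, k):
    """Node set R of the k-star obtained from the cheapest (k+1)-star by deleting its costliest leaf."""
    _total, center, leaves = cheapest_star(cost, nodes, k + 1)
    return [center] + leaves[:-1]


# Hypothetical metric instance: six points on a line with cost |u - v|.
nodes = list(range(6))
cost = {u: {v: abs(u - v) for v in nodes} for u in nodes}
total, center, leaves = cheapest_star(cost, nodes, 4)
R = choose_cover_nodes(cost, nodes, 3)
```

Selecting the cheapest edges per center with a heap keeps the work per center close to linear in the number of nodes, in line with the running times quoted for computing stars.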

Let us analyze the approximation ratio. By [11], $c(H) \le 2opt$. We claim that
$c(F) \le \frac{k-1}{n} opt$. Indeed, similarly to [11], using the metric cost assumption it is
not hard to show that $c(J_{k+1}) = \sum_{i=1}^{k} w_i \le \frac{2}{n} opt$. Thus, by our
choice of $J_k$, $w_{k-1} = \max\{w(v) : v \in R\} \le \frac{1}{n} opt$. Using this, the metric costs
assumption, and Lemma 5 we have:

\[
\begin{aligned}
c(F) &= \sum\{c(v_i v_j) : v_i v_j \in F\} \le \sum\{w_i + w_j : v_i v_j \in F\} \\
     &\le (k-2) w_{k-1} + \sum\{w(v) : v \in R\} \le (k-2) w_{k-1} + \left(\tfrac{2}{n} opt - w_k\right) \\
     &\le (k-3) w_{k-1} + \tfrac{2}{n} opt \le \tfrac{k-3}{n} opt + \tfrac{2}{n} opt = \tfrac{k-1}{n} opt.
\end{aligned}
\]

Theorem 4. There exists a $(2 + \frac{k-1}{n})$-approximation algorithm with time
complexity $O(k^2 n^2 m)$ for the metric min-cost $k$-connected subgraph problem.

3.3 Min-Cost 6,7-Connected Subgraphs 

This section presents our algorithms for the min-cost 6,7-connected subgraph
problems. The main difficulty is to show that for $k = 6,7$ we can make the
output graph of OCSA $k$-connected by adding an edge set $F$ with $|F| \le 2$. A
similar approach was used previously in [5] for $k = 4, 5$ with $|F| \le 1$:

Lemma 6 ([5], Lemma 4.5). Let $G$ be a graph which is $k$-outconnected from
$r$, $k \in \{4,5\}$. If $\gamma_G(r) = k$, then there exists a pair of nodes $s,t \in \Gamma_G(r)$ such
that $G \cup st$ is $k$-connected.

In fact, Lemma 6 can be deduced from Lemma 1 and the following lemma:

Lemma 7 ([9], Lemma 3.2). Let $G$ be an $l$-connected graph. If $\nu_l(G) = 2$, then
$\mathcal{C}_l(G) = \{S, T\}$ for some $S,T \subset V(G)$, where $S \subseteq T^*$ and $T \subseteq S^*$. Thus for any
$s \in S$ and $t \in T$, $G \cup st$ is $(l+1)$-connected.

Our algorithm for $k = 6, 7$ is based on the following lemma:

Lemma 8. Let $G$ be $k$-outconnected from $r$, $k \in \{6,7\}$. If $\gamma_G(r) \in \{6,7\}$ then
there exist two pairs of nodes $\{s_1, t_1\}, \{s_2, t_2\} \subseteq \Gamma_G(r)$ such that $G \cup \{s_1 t_1, s_2 t_2\}$
is $k$-connected.

Proof. Let $G$ be as in the lemma, and $k \in \{6, 7\}$. For convenience, let $R = \Gamma_G(r)$,
$H = G \setminus r$, and $l = k - \lfloor \gamma_G(r)/2 \rfloor = k - 3 \in \{3,4\}$. By Lemma 1 (ii), $G$ is
$(l+1)$-connected and $H$ is $l$-connected. To prove the lemma, it is sufficient to show
existence of two pairs of nodes $\{s_1, t_1\}, \{s_2, t_2\} \subseteq R$ such that $H \cup \{s_1 t_1, s_2 t_2\}$
is $(l+2)$-connected. In the proof, the default subscript of the functions $\Gamma$ and $\gamma$
is $H$.

In what follows, note that, by Lemma 1 (i), if $S$ is $p$-tight in $H$ then $|S \cap R| \ge
k - p = l + 3 - p$. Thus $\nu_l(H) \le 2$, and $\nu_{l+1}(H) \le 3$. Recall also that, by the
definition, no set in $\mathcal{C}_{l+1}(H)$ properly contains a set from $\mathcal{C}_l(H)$. The following
lemma establishes some structure of the sets in $\mathcal{C}_{l+1}(H)$ (for its proof see the
full version):

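Lemma 9. (i) If $\mathcal{C}_l(H) \ne \emptyset$ then: $\mathcal{C}_l(H) = \{S, T\}$ for some $S,T \subset V$, where
$S \subseteq T^*$ and $T \subseteq S^*$, and for any $X \in \mathcal{C}_{l+1}(H)$ either $X \subseteq S$ or $X \subseteq T$.

(ii) Let $X,Y \in \mathcal{C}_{l+1}(H)$ cross. Then $\gamma(X) = \gamma(Y) = l + 1$, $\gamma(X \cup Y) = l$, $X \cap Y$
is $(l+2)$-tight, and either $X \cup Y \in \mathcal{C}_l(H)$, or $\mathcal{C}_{l+1}(H) = \{X, Y, X^*, Y^*\}$.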

By Lemma 1 (i), for any $S \in \mathcal{C}_{l+1}(H)$ holds $|S \cap R| \ge 2$. Thus, if no two sets
in $\mathcal{C}_{l+1}(H)$ cross, then $\tau_{l+1}(H) = \nu_{l+1}(H) \le 3$. In this case, the statement is a
straightforward consequence of Corollary 1.

Assume now that there exist $X,Y \in \mathcal{C}_{l+1}(H)$ that cross and $\mathcal{C}_{l+1}(H) =
\{X, Y, X^*, Y^*\}$. By Lemma 9 (ii), $X \cap Y$ is $(l+2)$-tight, thus $X \cap Y \cap R \ne \emptyset$.
Let $U = \{x, y, z\}$, where $x \in X^*$, $y \in Y^*$, and $z \in X \cap Y \cap R$. Then $U$ covers
$\mathcal{C}_{l+1}(H) = \{X, Y, X^*, Y^*\}$, so $U$ is an $(l+1)$-cover, and $U \subseteq R$. The statement
in this case follows again from Corollary 1.

Henceforth assume that for any $X,Y \in \mathcal{C}_{l+1}(H)$ that cross, $X \cup Y \in \mathcal{C}_l(H)$,
and that there exists at least one such pair. By Lemma 9 (i), $\mathcal{C}_l(H) = \{S, T\}$ for
some $S,T \subset V$, $T \subseteq S^*$ and $S \subseteq T^*$, and for any $X \in \mathcal{C}_{l+1}(H)$ either $X \subseteq S$ or
$X \subseteq T$. (For the proof of the following lemma see the full version.)

Lemma 10. Let $\mathcal{C}$ be a collection of subsets of $S$, $\nu(\mathcal{C}) \le 2$, and let $U$ be a
$\mathcal{C}$-cover. If for any $X,Y \in \mathcal{C}$ that cross holds $X \cup Y = S$, then there is a $\mathcal{C}$-cover
$U' \subseteq U$ with $|U'| \le 2$.

Let $\mathcal{S}$ (resp., $\mathcal{T}$) denote all the sets in $\mathcal{C}_{l+1}(H)$ contained in $S$ (resp., in $T$).
By Lemma 10, there is a pair $\{s_1, s_2\} \subseteq R$ that covers $\mathcal{S}$, and there is a pair
$\{t_1, t_2\} \subseteq R$ that covers $\mathcal{T}$.

Lemma 11. The graph $H' = H \cup \{s_1 t_1, s_2 t_2\}$ is $(l+2)$-connected.

Proof. Note that $H'$ is $(l+1)$-connected. Assume to the contrary that $H'$ is not
$(l+2)$-connected. Then there is an $(l+1)$-tight set $A$ in $H'$. Note that $A$ is
also $(l+1)$-tight in $H$. Each of $A$ and $A^*$ contains a set from $\mathcal{C}_{l+1}(H)$. Thus
$\{s_1, s_2\} \cap A \ne \emptyset$, and $\{t_1, t_2\} \cap A^* \ne \emptyset$. Let us assume w.l.o.g. that $s_1 \in A$.
Then $t_1 \notin A^*$, so $t_2 \in A^*$. The latter implies $s_2 \notin A$. As a consequence, $A$
crosses $S$, and $t_2 \in A^* \cap T \subseteq A^* \cap S^*$. The latter implies $(A \cup S)^* \ne \emptyset$. Thus
$\gamma(A \cup S) \ge l + 1$, as otherwise $T \subseteq S^* \cap A^* \subseteq A^*$, implying $t_1 \in A^*$. Now, using
(1) we obtain a contradiction:

$(l+1) + l = \gamma(A) + \gamma(S) \ge \gamma(A \cap S) + \gamma(A \cup S) \ge (l+1) + (l+1)$.

The proof of Lemma 8 is done.

Two pairs $\{s_1, t_1\}, \{s_2, t_2\}$ as in Lemma 8 can be found in $O(m)$ time, e.g.,
by exhaustive search. Combining this and Lemma 8 we obtain:

Theorem 5. For $k = 6,7$, there exists a 4-approximation algorithm for the
min-cost $k$-connected subgraph problem. The time complexity of the algorithm
is $O(n^3 m)$ deterministic (using OCSA) and $O(n^2 m \log n)$ randomized (using
ROCSA).

3.4 Fast Algorithm for k = 4

In this section we present a 3-approximation algorithm for $k = 4$ with complexity
$O(n^4)$. This improves the previously best known time complexity $O(n^5)$ [5]. Let
us call a subset $R$ of nodes of a graph $G$ $k$-connected if for every $u,v \in R$ there
are $k$ internally disjoint paths between $u$ and $v$ in $G$. The following theorem is
due to Mader.

Theorem 6 ([16]). Any graph on $n \ge 5$ nodes with minimal degree at least $k$,
$k \ge 2$, contains a $k$-connected subset $R$ with $|R| = 4$.

It is known that the problem of finding a min-cost spanning subgraph with
minimal degree at least $k$ reduces to the weighted $b$-matching problem. Using
the algorithm of Anstee [1] for the latter problem, such a subgraph can be
found in $O(n^2 m)$ time. We use these observations to obtain a 3-approximation
algorithm for $k = 4$ as follows. The algorithm has two phases. In phase 1, among
the subgraphs of $\mathcal{G}$ with minimal degree 4, we find an optimal one, say $F$. Then
we find in $F$ a 4-connected subset $R$ with $|R| = 4$. In phase 2, we execute
OCSA on $R$, and let $H$ be its output. Finally, the algorithm outputs $H \cup F$.

Theorem 7. There exists a 3-approximation algorithm for the min-cost 4-connected
subgraph problem, with time complexity $O(n^2 m + nT(n)) = O(n^4)$, where
$T(n)$ is the time required for multiplying two $n \times n$ matrices.

Proof. The correctness of the algorithm follows from Theorem 6, Lemma 1(i),
and Corollary 1. To see the approximation ratio, recall that $c(H) \le 2\,opt$, and
note that $c(F) \le opt$.

We now prove the time complexity. The complexity of each step, except
for finding a 4-connected subset in $F$, is $O(n^2 m)$. Let us show that finding a
4-connected subset can be done in $O(n^2 m + nT(n))$ time. Using the Ford-Fulkerson
max-flow algorithm, we construct in $O(n^2 m)$ time the graph $J = (V, E')$, where
$(s, t) \in E'$ if and only if there are 4 internally disjoint paths between $s$ and $t$ in $F$.
Now, $S$ is a 4-connected subset in $F$ if and only if the subgraph induced by $S$ in
$J$ is a complete graph. Thus, finding $S$ as above reduces to finding a complete
subgraph on 4 nodes in $J$. This can be implemented as follows. Observe that
$S = \{s, u, v, w\}$ induces a complete subgraph in $J$ if and only if $\{u, v, w\}$ form a
triangle in the subgraph induced by $\Gamma_J(s)$ in $J$. It is well known that finding a
triangle in a graph reduces to computing the square of the adjacency matrix
of the graph. The best known time bound for multiplying two $n \times n$ matrices is
given in [4], and the time complexity follows.
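The reduction from a complete subgraph on 4 nodes to triangle finding can be sketched as follows. This is a minimal illustration, not the implementation from the paper: the function names `find_triangle` and `find_k4` are hypothetical, the graph $J$ is assumed to be given as a symmetric 0/1 adjacency matrix with zero diagonal, and the matrix square is computed naively rather than with a fast matrix multiplication routine.

```python
def find_triangle(adj):
    """Return a triangle (i, j, k) in the graph with 0/1 adjacency matrix adj, or None.

    adj must be symmetric with a zero diagonal. Entry (i, j) of the squared
    matrix counts the common neighbours of i and j, so an edge ij with a
    positive entry in the square closes a triangle.
    """
    n = len(adj)
    square = [[sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] and square[i][j] > 0:
                k = next(k for k in range(n) if adj[i][k] and adj[k][j])
                return (i, j, k)
    return None


def find_k4(adj):
    """Return four mutually adjacent nodes of the graph, or None if there are none."""
    n = len(adj)
    for s in range(n):
        # Restrict to the neighbourhood of s and look for a triangle there.
        nbrs = [v for v in range(n) if adj[s][v]]
        sub = [[adj[u][v] for v in nbrs] for u in nbrs]
        tri = find_triangle(sub)
        if tri is not None:
            return (s,) + tuple(nbrs[t] for t in tri)
    return None
```

With one matrix square per node $s$, this sketch performs $n$ matrix multiplications, which matches the $nT(n)$ term in the bound when a fast multiplication routine is substituted for the naive one.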

3.5 Metric Multiroot Problem: Cases k ≤ 7

In this section we consider the metric-cost multiroot problem. Note that here
$\mathcal{G}$ is a complete graph, and every edge in $\mathcal{G}$ has cost at most $opt/k$. This is
because any feasible solution contains at least $k$ edge-disjoint paths between any
two nodes $s$ and $t$, and, by the metric cost assumption, each one of these paths
has cost $\ge c(st)$. For $k \le 7$, we give an algorithm with approximation ratio
$2 + \lfloor (k-1)/2 \rfloor / k < 2.5$. This improves the previously best known approximation
ratio 3 [3]. Our algorithm combines some ideas from [3], [2,5], and some results
from the previous section.

Splitting off two edges ru, rv means deleting ru and rv and adding a new 
edge uv. 

Theorem 8 ([3], Theorem 17). Let $G = (V,E)$ be a graph which is $k$-outconnected
from a root node $r \in V$, and suppose that $d_G(r) \ge k+2$ and every edge
incident to $r$ is critical w.r.t. $k$-outconnectivity from $r$. If $G$ is not $k$-connected,
then there exists a pair of edges incident to $r$ that can be split off preserving
$k$-outconnectivity from $r$.

Consider now an instance of a metric cost multiroot problem, and let $r$ be a
node with the maximum requirement $k$. As was pointed out in [3], Theorem 8 implies
that we can produce a spanning subgraph $G$ of $\mathcal{G}$, such that $G$ is $k$-outconnected
from $r$, $c(G) \le 2\,opt$, and: $G$ is $k$-connected, or $d_G(r) \in \{k, k+1\}$. To handle
the cases $k = 5,7$, we show that by adding one edge, we can reduce the case
$d_G(r) = k + 1$ to the already familiar case $d_G(r) = k$. (For a proof of the following
lemma see the full version.)

Lemma 12. Let $G = (V, E)$ be $k$-outconnected from a root node $r \in V$, let
$R = \Gamma_G(r)$, and let $rx$ be critical w.r.t. $k$-outconnectivity from $r$. If $d_G(r) \ge k+1$,
then there exists a node $y \in R$ such that $(G \setminus rx) \cup xy$ is $k$-outconnected from $r$.

Lemma 13. Let $G$ be a graph which is $k$-outconnected from $r$, $3 \le k \le 7$, and
suppose that $d_G(r) \in \{k, k+1\}$. Then there is an edge set $F \subseteq \{uv : u, v \in
\Gamma_G(r)\}$ such that: $G \cup F$ is $k$-connected and $|F| \le \lfloor (k-1)/2 \rfloor$.

Proof. For $k \le 4$, this is a straightforward consequence of Lemmas 1 and 7.
For $k = 6$ this is a consequence of Lemma 8. For $k = 5, 7$, it can be easily
deduced using Lemma 12 and: Lemma 6 for $k = 5$, or Lemma 8 for $k = 7$.

Using Lemma 13 and the fact that $c(st) \le opt/k$ for every $s,t \in V$, we
deduce:

Theorem 9. For the metric cost multiroot problem with $3 \le k \le 7$, there exists
a $(2 + \lfloor (k-1)/2 \rfloor / k)$-approximation algorithm with time complexity 0{n^m).
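The ratio in Theorem 9 can be traced as follows. This is a worked reading of the argument, assuming the bounds stated above: $c(G) \le 2\,opt$, $|F| \le \lfloor (k-1)/2 \rfloor$ from Lemma 13, and $c(st) \le opt/k$ for every edge.

```latex
\[
  c(G \cup F) \;\le\; c(G) + |F| \cdot \frac{opt}{k}
  \;\le\; \Bigl( 2 + \frac{\lfloor (k-1)/2 \rfloor}{k} \Bigr)\, opt .
\]
% Values of the additive term for 3 <= k <= 7:
%   k = 3: 1/3,   k = 4: 1/4,   k = 5: 2/5,   k = 6: 2/6,   k = 7: 3/7.
% The largest is 3/7 (k = 7), so the ratio stays below 2 + 1/2 = 2.5.
```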
Abstract. We consider two tiling problems for two-dimensional arrays: given
an n X n array A of nonnegative numbers we are to construct an optimal partition 
of it into rectangular subarrays. The subarrays cannot overlap and they have to 
cover all array elements. The first problem (RTILE) consists in finding a partition 
using p subarrays that minimizes the maximum weight of subarrays (by weight 
we mean the sum of all elements covered by the subarray). The second, dual 
problem (DRTILE), is to construct a partition into minimal number of subarrays 
such that the weight of each subarray is bounded by a given value W. 

We show a linear-time 7/3-approximation algorithm for the RTILE problem. 
This improves the best previous result both in time and in approximation ratio. 
If the array A is binary (i.e. contains only zeroes and ones) we can reduce the
approximation ratio down to 2. For the DRTILE problem we get an algorithm which
achieves a ratio of 4 and works in linear time. The previously known algorithm with
the same ratio worked in time 0{tf ). For binary arrays we present a linear-time 
2-approximation algorithm. 



1 Introduction 

We consider two optimization tiling problems. 

Problem RTILE: 

Input: an n × n array A of nonnegative numbers and a positive integer p.

Task: Partition A into p rectangular nonoverlapping subarrays such that the maximum 
weight of the subarrays is minimal. 

Problem DRTILE: 

Input: an n × n array A of nonnegative numbers and a positive number V.

Task: Partition A into minimal number of rectangular nonoverlapping subarrays such 
that the weight of each subarray is not greater than V. 
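To make the two objectives concrete, the following sketch evaluates a candidate tiling. It is only an illustration of the definitions above: the helper names (`prefix_sums`, `weight`, `is_tiling`, `max_weight`) and the rectangle encoding as half-open index ranges `(r0, r1, c0, c1)` are assumptions of this example and do not come from the paper.

```python
def prefix_sums(A):
    """P[r][c] = sum of A over rows < r and columns < c."""
    rows, cols = len(A), len(A[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            P[r + 1][c + 1] = A[r][c] + P[r][c + 1] + P[r + 1][c] - P[r][c]
    return P


def weight(P, rect):
    """Weight of the rectangle rows r0..r1-1, columns c0..c1-1, in O(1) time."""
    r0, r1, c0, c1 = rect
    return P[r1][c1] - P[r0][c1] - P[r1][c0] + P[r0][c0]


def is_tiling(rows, cols, rects):
    """True if the rectangles cover every cell exactly once (no overlap, no gap)."""
    covered = [[0] * cols for _ in range(rows)]
    for r0, r1, c0, c1 in rects:
        for r in range(r0, r1):
            for c in range(c0, c1):
                covered[r][c] += 1
    return all(x == 1 for row in covered for x in row)


def max_weight(A, rects):
    """RTILE objective: the maximum weight over the rectangles of a tiling."""
    P = prefix_sums(A)
    return max(weight(P, rect) for rect in rects)
```

In these terms, RTILE asks for a tiling with `len(rects) == p` minimizing `max_weight`, while DRTILE asks for a tiling with `max_weight(A, rects) <= V` minimizing `len(rects)`.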

These problems are very attractive for at least two reasons. First, they have very 
simple classical definitions; second, they are general enough to capture many prob- 
lems naturally arising in many application areas. These areas include among others 
load balancing in parallel computing environments, data compression, or building two- 
dimensional histograms. The interested reader can find more details on the applications 
in e.g. [4] and [7]. 

The class of tiling problems is very wide ([7]). The problems it includes differ in
array dimensions, restrictions on the values of array elements, definitions of metric
functions, types of tiles, etc. With respect to the last issue we distinguish three main
types of tilings: $p \times p$, where $p^2$ tiles are induced by choosing p vertical and p horizontal
lines ([5], [2]); hierarchical, where the first step of partitioning is done by choosing
some lines in one direction and then the resulting subarrays are divided recursively [6];
and arbitrary, where no restriction on tiles is imposed [4].

* Partially supported by Komitet Badan Naukowych, grants 8 T11C 032 15 and 8 T11C 044 19.

For one-dimensional arrays there are known polynomial-time algorithms yielding
exact solutions. In fact, the DRTILE problem can be solved in linear time by a simple
greedy algorithm. For the RTILE problem the dynamic programming strategy is more
suitable. It allows one to solve the problem in time $O(np)$. Interestingly, there is another
approach which results in an algorithm working in time $O(\min\{n + p^{1+\epsilon}, n \log n\})$ ([5]).
This beats the dynamic algorithm in the case of p = o{n^). 
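The dynamic programming strategy for the one-dimensional RTILE problem can be sketched as below. This is a plain textbook formulation written for illustration, with the hypothetical name `rtile_1d`; it runs in $O(n^2 p)$ time and makes no attempt to reach the faster bounds quoted above. It assumes $1 \le p \le n$ and splits the array into exactly $p$ nonempty intervals.

```python
def rtile_1d(a, p):
    """Minimum possible maximum interval weight when a is cut into p intervals."""
    n = len(a)
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)

    INF = float("inf")
    # best[k][j]: optimal value for the first j elements cut into k intervals.
    best = [[INF] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0
    for k in range(1, p + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):
                # The last interval covers a[i:j]; the rest is solved recursively.
                cand = max(best[k - 1][i], prefix[j] - prefix[i])
                if cand < best[k][j]:
                    best[k][j] = cand
    return best[p][n]
```

For example, `rtile_1d([1, 2, 3, 4, 5], 2)` evaluates the split `[1, 2, 3] | [4, 5]` as the best one, with maximum weight 9.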

The difficulty of the problems radically changes when we extend them to two
dimensions. Grigni and Manne [2] proved that optimal p x p tiling is NP-hard and
Charikar et al. [1] showed that in fact it is NP-hard to approximate within a factor of
2. For arbitrary tilings Khanna, Muthukrishnan and Paterson [4] showed that both the
RTILE and DRTILE problems are NP-hard even for the case when the array elements 
are integers bounded by a constant. Moreover, RTILE remains NP-hard when we relax 
our demands and look for solutions that are within a factor 5/4 of the optimal one. 

Then in [4], the authors construct several efficient approximation algorithms. For the
RTILE problem they present a 5 /2-approximation algorithm that works in time 0{n^ + 
plogn) and mention that using a similar technique they obtain a 9/4-approximation 
algorithm of the same time complexity for binary arrays, i.e. arrays with elements from 
the set {0, 1}. For the DRTILE problem, they develop a technique of a Hierarchical Binary
Tiling and using it construct a 4-approximation algorithm working in time 0{n^).
They also show that modifying this technique one can obtain a polynomial-time 2- 
approximation algorithm. Unfortunately, the polynomial is of prohibitively high degree. 
More practical is an algorithm they construct using a partitioning technique. It works 
in time 0{n^ + plogn) and achieves the approximation ratio 5 for arbitrary arrays and 
9/4 for binary arrays. 

New Results. We improve all the above results. First, we show a 7/3-approximation
algorithm that solves the RTILE problem. The technique we use is a kind of greedy
strategy that divides the array into strips that can be efficiently partitioned. On binary 
arrays the same algorithm achieves the ratio 2. Thus the algorithm beats the results from 
[4] not only in approximation ratio but also in time, which is linear. Another advantage 
over the algorithms from [4] is the simplicity of the produced partitions. Whereas the 
partitions from [4] are arbitrary, our algorithm gives hierarchical ones with a small depth 
(at most 3). 

For the DRTILE problem we construct a linear-time 4-approximation algorithm. 
This improves either time or approximation ratio of the best practical algorithms from 
[4]. In the case of binary arrays we obtain the ratio 2. 

2 Preliminaries 

Throughout the paper we will use the following notation and definitions:

- by the weight wg(A) of an array A we mean the sum of all its elements;

- $W = \max\{\, wg(A)/p,\ \max\{a_{i,j} : 1 \le i,j \le n\} \,\}$;

- to k-partition an array means to partition it into rectangles of weight not exceeding
kW;

- if the weight wg(A) of an array A fulfills $lW \le wg(A) < (l+1)W$ (for some integer
$l > 0$) then to k-partition A well means to k-partition it into at most $l$ rectangles;

- we say that a column is of type < (>, resp.) if its weight is less than W (not less 
than W, resp.); 

- by a < -group we mean a maximal subarray whose all columns are of type < and 
whose total weight is less than W ; 

- unless stated otherwise, we shall not distinguish between <-columns and <-groups; 

- by a >'-group we mean a maximal subarray whose all columns are of type < and 
whose total weight is at least W ; 

- then we extend this notation to subarrays, e.g. we say that a subarray is of type ><
if its first column is of type > and is followed by a <-group (in particular, a single
<-column).

First we observe that well-partitioning of subarrays implies well partitioning the 
whole array. 

Proposition 1. If an array B consists of the disjoint subarrays $B_1, \ldots, B_j$, and each $B_i$
has been well k-partitioned, then B has also been well k-partitioned. Moreover, each
good k-partition is a good l-partition for any $l \ge k$.



3 The RTILE Problem

3.1 General Case 

We start with simple subarrays that will be tiled by our algorithm independently. 
Lemma 1. A single column k of weight

$wg(k) < (m+1)W$, for some integer $m \ge 1$,
can be 2-partitioned into at most m rectangles.

Proof. The proof is by induction on m. If $wg(k) < 2W$ then k does not have to be
partitioned at all.

Let us now assume that we can 2-partition a column of weight $wg < jW$ into at most
$j-1$ rectangles for any $2 \le j \le m$. Let segment s represent column k of weight $wg(k) <
(m+1)W$. We divide s into intervals of lengths proportional to the weights of nonzero
elements.

If the line cutting off a unit segment (of length equal to W) falls inside some interval,
then we can move it to the right and obtain a rectangle of weight not exceeding
2W (see Fig. 1). Thus the remaining part of s has length $< mW$ and therefore can be by
assumption partitioned into $m - 1$ rectangles. ∎
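The inductive argument of Lemma 1 translates into a single left-to-right scan. The sketch below is one possible reading of it, under the assumption that every element is at most $W$ (which holds by the definition of $W$); the function name `two_partition_column` and the half-open index ranges it returns are choices made for this example only.

```python
def two_partition_column(column, W):
    """Cut a column into consecutive pieces of weight at most 2W.

    Assumes every element is at most W. While at least 2W of weight remains,
    the shortest prefix of weight >= W is cut off; it weighs less than 2W
    because the prefix before its last element weighed less than W.
    Returns a list of half-open index ranges (start, end).
    """
    pieces = []
    remaining = sum(column)
    start, acc = 0, 0
    for i, a in enumerate(column):
        if remaining < 2 * W:
            break  # what is left fits into one final rectangle
        acc += a
        if acc >= W:
            pieces.append((start, i + 1))
            remaining -= acc
            start, acc = i + 1, 0
    pieces.append((start, len(column)))
    return pieces
```

Each cut removes at least $W$ of weight, so a column of weight below $(m+1)W$ is left with less than $2W$ after at most $m-1$ cuts, giving at most $m$ pieces in total.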

The proof of Lemma 1 can be easily extended to any >'-group R, by treating
columns of R as single elements. Therefore we have

Fig. 1. The right end of the first unit segment falls inside $a_i$, so we move it to the end of this
interval.

Corollary 1. Any >'-group R can be 2-partitioned into m rectangles, where $mW \le
wg(R) < (m+1)W$.

Lemma 2. Any two-column subarray of weight at least W can be well 7/3-partitioned.

Proof. 

Claim. (A) Any two-column subarray S of weight

$W \le w(S) < 3W$

can be well 7/3-partitioned.

Proof of Claim (A). If S could not be well 7/3-partitioned horizontally, then there would
exist a row b such that (see Fig. 2)

$a + b > \frac{7}{3}W$ and $b + c > \frac{7}{3}W$

which would imply that $b > \frac{5}{3}W$. Since $b_1 \le W$, then $b_2 > \frac{2}{3}W$ and S could be well
7/3-partitioned vertically. ∎



Fig. 2.



Claim. (B) Any two-column subarray S of weight

$3W \le w(S) < 4W$

can be well 2-partitioned.

Proof of Claim (B). Let subarray S be represented by a segment s of length wg(S)/W,
similarly as in the proof of Lemma 1. We divide s into intervals of lengths proportional
to the weights of nonzero rows. If the line $l$ dividing s in half falls inside some interval
b, then we partition s into a, b, c; otherwise s is well 2-partitioned by the line $l$ (see
Fig. 3). ∎



Fig. 3.



Let us now assume that we can well 7/3-partition any two-column subarray S of
weight

$mW \le w(S) < (m+1)W$

for every integer m such that $1 \le m \le n$.

Let S' be a two-column subarray of weight

$(n+1)W \le w(S') < (n+2)W$

for some $n + 1 \ge 4$. Let i be such that $w(r_1 r_2 \ldots r_{i-1}) < W$ and $w(r_1 r_2 \ldots r_i) \ge W$, where
$r_i$ denotes the i-th row of subarray S'. Since $w(r_i) \le 2W$ we have $w(r_1 r_2 \ldots r_i) < 3W$ and
$w(r_{i+1} \ldots r_j) > (n + 1 - 3)W \ge W$. Therefore by the above claims and by induction we
can well 7/3-partition both parts, and by Proposition 1 we have a good 7/3-partition of
the whole S'. ∎

Note that the proof remains valid if we replace a <-column by a <-group. Indeed, 
we can treat the group as one < -column whose elements are obtained by summing 
appropriate rows. Since subarrays of type <> or >< have weight at least W we imme- 
diately get: 

Corollary 2. Any subarray of type <> or >< can be well 7/3-partitioned.

Now we are ready to present the algorithm. 

Algorithm. At the very beginning the algorithm scans the array from left to right and
divides it into vertical strips. Each strip is either a single >-column, a >'-group or a
<-group (perhaps consisting of one <-column). Note that due to maximality of groups,
no two groups are adjacent. In particular, no two consecutive strips are <-columns and
any >'-group abuts only on >-columns. Then the algorithm rescans the array and uses
the methods from Lemmata 1 and 2 to partition single strips or pairs of strips.

If the strip under consideration is a >-column followed by another >-column or a
>'-group, then it is dealt with by the method from Lemma 1. The same method is used
for a >'-group.

If the studied strip is a >-column and is followed by a <-group then we make use
of the method from Lemma 2, which naturally is also used if the order is reversed, i.e.
when the first strip is a <-group and the second is a >-column. If at the end we are left
with a <-group, then it will form the last rectangle.
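The first scan of the algorithm, which splits the array into strips, can be sketched as follows. This is an illustrative reading of the definitions in Section 2, not code from the paper: the function name `split_into_strips` and the string labels are assumptions, the input is the list of column weights, and each maximal run of <-columns is classified as a <-group or a >'-group by its total weight.

```python
def split_into_strips(col_weights, W):
    """Split columns into strips, returned as (label, start, end) with half-open ranges.

    ">"  : a single column of weight at least W
    "<"  : a maximal run of light columns whose total weight is below W
    ">'" : a maximal run of light columns whose total weight is at least W
    """
    strips = []
    i, n = 0, len(col_weights)
    while i < n:
        if col_weights[i] >= W:
            strips.append((">", i, i + 1))
            i += 1
        else:
            j, total = i, 0
            while j < n and col_weights[j] < W:
                total += col_weights[j]
                j += 1
            strips.append(("<" if total < W else ">'", i, j))
            i = j
    return strips
```

Because runs of light columns are taken maximally, two groups are never adjacent in the output, which is the property the algorithm relies on when pairing a <-group with a neighbouring >-column.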

Lemma 3. The above algorithm yields 7/3 approximation. 

Proof. 

The only doubtful situation arises when the last rectangle in a partition is used to
cover a <-group. It may seem that we can go beyond the allowed number p of rectangles.
We show that this is not the case.

Suppose that $A = [A_1, A_2]$, $A_1$ has been well 7/3-partitioned and $0 < w(A_2) <
W$. The number of rectangles used for covering $A_1$ is at most $\lfloor w(A_1)/W \rfloor$. Since
$w(A) = w(A_1) + w(A_2) \ge \lfloor w(A_1)/W \rfloor W + w(A_2)$ and since $p \ge w(A)/W$ by definition, we conclude
that $p > \lfloor w(A_1)/W \rfloor$. ∎

In this way we have proved our main result. 

Theorem 1. There exists a linear-time 7/3-approximation algorithm for the RTILE
problem.

3.2 Binary Arrays 

Now we show that the algorithm achieves the approximation factor 2 on binary arrays. 

Theorem 2. There exists a linear-time 2-approximation algorithm for the RTILE prob- 
lem on binary arrays. 

To prove this it suffices to sharpen Corollary 2. Note, however, that for subarrays of
type <> or >< we cannot now draw a conclusion from lemmata stated for two-column
subarrays, because for binary arrays a <-group can no longer be treated as a single binary
column. For this reason we have to reformulate Lemma 2.

Lemma 4. Let S be a two-column subarray, whose one column is binary and the second
column contains elements not greater than W − 1. Then S can be well 2-partitioned.

Proof. 

Note that if p > w(A) then one can easily partition A into w(A) rectangles so that 
each of them contains exactly one element equal to 1. Obviously such a partition is 
optimal. 

Therefore let W — > 2. Clearly W < OPT, where OPT is the value of the 

optimal solution. 

A careful analysis of the proof of Lemma 2 makes it obvious that it suffices to show 
the lemma for arrays of weight less than 3W. 

Claim. Any two-column binary subarray S of weight

$W \le w(S) < 3W$

can be well 2-partitioned.

Proof of Claim. Assume that S cannot be well 2-partitioned horizontally. Then there
exists a row b such that $a + b > 2W$ and $b + c > 2W$, which implies $b > W$. We get a
contradiction, because $b_1 \le 1$ and $b_2 \le W - 1$ (or vice versa). ∎

This completes the proof of the Claim and of Lemma 4.



4 DRTILE 

In this section we show two approximate solutions to the DRTILE problem. 

Theorem 3. There exists a linear-time 2-approximation algorithm for the DRTILE problem
restricted to binary arrays.

Proof. Let A be a binary array and let V be the limit on the weight of rectangles used
in partitions. We can assume that V is greater than 2, because otherwise we can apply a
straightforward algorithm dividing A into w(A) rectangles, each containing a single 1.
Moreover, we can assume that $w(A) > V$.

Let $p_0 = \lceil w(A)/V \rceil$. Obviously $p_0$ is a lower bound on the optimal solution to DRTILE
on A. The algorithm from Theorem 2 called with the parameter $p = 2p_0$ gives a partition
of A into at most $2p_0$ rectangles. Since $w(A)/p \ge 1 = \max\{a_{i,j}\}$ for $V \ge 3$, the
weight of any of these rectangles is not greater than $2w(A)/p = w(A)/p_0 \le V$. ∎

Due to large elements that could appear in the array the same method cannot be ap- 
plied for the general case. However, we show that small modifications to the algorithm 
from Theorem 1 suffice to solve the problem. 

Theorem 4. There exists a linear-time 4-approximation algorithm for the DRTILE problem.

Proof. Let A be an n × n array and let V be the limit on the weight of rectangles used in
partitions. We start by applying the same preprocessing procedure as for the RTILE
problem (with $W = V$; we are allowed to do so because no element can have weight
greater than V). Recall that it compresses <-columns so that the resulting array
contains no two consecutive columns of type <. Let k be the total number of >-columns
and >'-groups after the preprocessing. Obviously k is not greater than $w(A)/V \le OPT$.

Now we apply a simple greedy procedure to partition >-columns. The procedure
scans a column computing the sum of the scanned elements. When the current sum is
about to exceed V, it forms a rectangle, resets the counter and resumes scanning. Since each
pair of two consecutive rectangles created in this way has a total sum greater than V,
the column is divided into at most $2m + 1$ rectangles, where m is the largest integer
such that $mV$ is not greater than the weight of the column. In a similar way we divide
>'-groups. Thus the total number of rectangles used for covering >-columns and
>'-groups is not greater than $2M + k$, where $MV$ lower bounds the total weight of elements
placed in these areas.

If we cover each <-column by a separate rectangle, then the total number of rectangles
used is bounded by $2M + k + t$, where t is the number of <-columns in A. Note
that $k + t$ is not greater than $2\,OPT$. Indeed, if $k = OPT$, then after preprocessing there
is no <-column at all, and otherwise the number of <-columns is at most $k + 1$. Since
$MV \le w(A)$, we have $M \le w(A)/V \le OPT$, and therefore $2M + k + t \le 4\,OPT$. ∎
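The procedure analysed in this proof can be sketched end to end as follows. This is a simplified illustration written for this text, not the authors' implementation: the names `greedy_cuts` and `drtile_sketch` are hypothetical, rectangles are half-open ranges `(r0, r1, c0, c1)`, every element is assumed to be at most `V`, and each maximal run of light columns is handled by the same greedy scan applied to whole columns (so a run of total weight below `V` becomes a single rectangle).

```python
def greedy_cuts(values, V):
    """Cut a sequence into consecutive pieces of sum at most V (each value <= V)."""
    cuts, start, acc = [], 0, 0
    for i, v in enumerate(values):
        if acc + v > V and i > start:
            cuts.append((start, i))  # the sum was about to exceed V: close the piece
            start, acc = i, 0
        acc += v
    if start < len(values):
        cuts.append((start, len(values)))
    return cuts


def drtile_sketch(A, V):
    """Tile A with rectangles of weight at most V, following the greedy scheme above."""
    n_rows, n_cols = len(A), len(A[0])
    col_w = [sum(A[r][c] for r in range(n_rows)) for c in range(n_cols)]
    rects = []
    c = 0
    while c < n_cols:
        if col_w[c] >= V:
            # Heavy column: greedy scan over its elements.
            for r0, r1 in greedy_cuts([A[r][c] for r in range(n_rows)], V):
                rects.append((r0, r1, c, c + 1))
            c += 1
        else:
            # Maximal run of light columns: greedy scan over whole columns.
            d = c
            while d < n_cols and col_w[d] < V:
                d += 1
            for c0, c1 in greedy_cuts(col_w[c:d], V):
                rects.append((0, n_rows, c + c0, c + c1))
            c = d
    return rects
```

The sketch only guarantees feasibility (every rectangle weighs at most `V` and the rectangles tile the array); the factor 4 is the bound argued in the proof above and is not re-derived here.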



5 Conclusions 

We have presented new solutions to the RTILE and DRTILE problems for both arbitrary 
and binary arrays. They improve the best known results from [4] in several aspects: time, 
approximation ratios (with exception of DRTILE for general matrices where only time 
is improved) and simplicity of the produced partitions (they are hierarchical with the 
hierarchy depth bounded to 3, whereas the partitions from [4] are arbitrary). 

However there is still a big gap between the lower bound 5/4 on approximability 
and our results. We believe that some further progress can be achieved by slight modifi- 
cations of our methods. In particular, the approximation ratio 2 seems to be achievable 
for the general case of the RTILE problem. Better approximation factors cannot
be excluded either, but achieving them will require different, likely more involved, methods
of analysis than we have used. In our proofs the value W served to express a lower 
bound for the value of optimal solution. As we note below this limits applicability of 
our methods. 

Claim. There are instances of the RTILE problem for which the optimal solution is 
arbitrarily close to 2W. 

Proof of Claim. Let a $1 \times (2m + 1)$ array A be defined as follows:

$$a_{1,i} = \begin{cases} 1 & \text{for odd } i \\ 1/m & \text{for even } i \end{cases}$$

and let $p = m$. Then $w(A) = m + 2$ and $W = 1 + 2/m$. On the other hand the value of
the optimal solution is equal to $2 + 1/m = 2W - 3/m$. ∎
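The numbers in this claim can be checked step by step. The following is a worked reading, assuming $W = \max\{w(A)/p, \max a_{i,j}\}$ as in Section 2; the pigeonhole step is spelled out here and is not quoted from the paper.

```latex
% The array has m+1 odd positions (value 1) and m even positions (value 1/m):
\[
  w(A) = (m+1)\cdot 1 + m \cdot \frac{1}{m} = m + 2,
  \qquad
  W = \max\Bigl\{ \frac{m+2}{m},\, 1 \Bigr\} = 1 + \frac{2}{m}.
\]
% With p = m tiles and m+1 entries equal to 1, some tile contains two of them,
% together with the entry 1/m lying between them:
\[
  OPT \;\ge\; 1 + \frac{1}{m} + 1 \;=\; 2 + \frac{1}{m}
  \;=\; 2\Bigl(1 + \frac{2}{m}\Bigr) - \frac{3}{m} \;=\; 2W - \frac{3}{m}.
\]
% Hence OPT / W tends to 2 as m grows.
```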
Abstract. We study several old and new algorithms for computing
lower and upper bounds for the Steiner problem in networks using dual- 
ascent and primal-dual strategies. We show that none of the known algo- 
rithms can both generate tight lower bounds empirically and guarantee 
their quality theoretically; and we present a new algorithm which combines
both features. The new algorithm has running time $O(re \log n)$
and guarantees a ratio of at most two between the generated upper and 
lower bounds, whereas the fastest previous algorithm with comparably 
tight empirical bounds has running time O(e^) without a constant ap- 
proximation ratio. Furthermore, we show that the approximation ratio 
two between the bounds can even be achieved in time $O(e + n \log n)$,
improving the previous time bound of $O(n^2 \log n)$.

Keywords: Steiner problem; relaxation; lower bound; approximation al- 
gorithms; dual-ascent; primal-dual 



1 Introduction 

The Steiner problem in networks is the problem of connecting a set of required 
vertices in a weighted graph at minimum cost. This is a classical NP-hard problem
with many important applications in network design in general and VLSI
design in particular (see for example [12]).

For combinatorial optimization problems like the Steiner problem which can 
naturally be formulated as integer programs, many approaches are based on 
linear programming. For an NP-hard problem, the optimal value of the linear
programming relaxation of such a (polynomially-sized) formulation can only 
be expected to represent a lower bound on the optimal solution value of the 
original problem, and the corresponding integrality gap (which we define as the 
ratio between the optimal values of the integer program and its relaxation) is a 
major criterion for the utility of a relaxation. For the Steiner problem, we have 
performed an extensive theoretical comparison of various relaxations in [17]. 

To use a relaxation algorithmically, many approaches are based on the LP- 
duality theory. Any feasible solution to the dual of such a relaxation provides 
a lower bound for the original problem. The classical dual-ascent algorithms 
construct a dual feasible solution step by step, in each step increasing some dual 
variables while preserving dual feasibility. This is also the main idea of many 
recent approximation algorithms based on the primal-dual method, where an
approximate solution to the original problem and a feasible solution to the dual
of an LP relaxation are constructed simultaneously. The performance guarantee
is proved by comparing the values of both solutions [11].

In this paper we study some old and new dual-ascent based algorithms for 
computing lower and upper bounds for the Steiner problem. Two approximation 
ratios will be of concern in this paper: the ratio between the upper bound and 
the optimum, and the ratio between the (integer) optimum and the lower bound. 
The main emphasis will be on lower bounds, with upper bounds mainly used in 
a primal-dual context to prove a performance guarantee for the lower bounds. 
Despite the fact that calculating tight lower bounds efficiently is highly desirable 
(for example in the context of exact algorithms or reduction tests [18, 5, 15]), this 
issue has found much less attention in the literature. For recent developments 
concerning upper bounds, see [18]. 

After some preliminaries, we will discuss in Section 2 the classical primal- 
dual algorithm for the (generalized) Steiner problem based on an undirected 
relaxation. In Section 3, we study a classical dual-ascent approach based on a 
directed relaxation, and show that it cannot guarantee a constant approximation 
ratio for the generated lower (or upper) bounds. In Section 4, we introduce a 
new primal-dual algorithm based on the directed relaxation which guarantees a 
ratio of at most 2 between the upper and lower bounds, while producing tight 
lower bounds empirically. Section 5 contains some concluding remarks. 

Detailed computational experiments and some additional explanations and re- 
sults are given in [19]. 

Preliminaries 

For any undirected graph $G = (V, E)$, we define $n := |V|$, $e := |E|$, and assume
that $(v_i,v_j)$ and $(v_j,v_i)$ denote the same (undirected) edge $\{v_i,v_j\}$. A network
is here a weighted graph $(V, E, c)$ with an edge weight function $c : E \to \mathbb{R}$. For a
subgraph $H$ of $G$, we abuse the notation $c(H)$ to denote the sum of the weights
of the edges in $H$ with respect to $c$. For any directed network $G = (V, A, c)$, we
use $[v_i,v_j]$ to denote the arc from $v_i$ to $v_j$; and define $a := |A|$.

The Steiner problem in networks can be formulated as follows: Given a
network $G = (V,E,c)$ and a non-empty set $R$, $R \subseteq V$, of required vertices
(or terminals), find a subnetwork $T_G(R)$ of $G$ such that in $T_G(R)$, there is a
path between every pair of terminals, and $c(T_G(R))$ is minimized. The directed
version of this problem is defined similarly (see [12]). Every instance of the
undirected version can be transformed into an instance of the directed version in
the corresponding bidirected network, fixing a terminal $z_1$ as the root. We define
$R_1 := R \setminus \{z_1\}$ and $r := |R|$; and assume that $r > 1$. If the terminals are to
be distinguished, they are denoted by $z_1, \ldots, z_r$. Without loss of generality, we
assume that the edge weights are positive and that $G$ is connected. Now $T_G(R)$
is a tree. A Steiner tree is an acyclic, connected subnetwork of $G$, including $R$.

By computing a minimum spanning tree for $D_G(R)$, the distance network
with the vertex set $R$ with respect to $G$, and replacing its edges with the corresponding
paths in $G$, we get a feasible solution for the original instance; this
is the core of the well-known heuristic DNH (Distance Network Heuristic; see
[12]) with a worst case performance ratio of $2 - 2/r$. Mehlhorn [16] showed how
to compute such a tree efficiently by using a concept similar to that of Voronoi
regions in algorithmic geometry. For each terminal $z$, we define a neighborhood
$N(z)$ as the set of vertices which are not closer to any other terminal (ties broken
arbitrarily). Consider a graph $G'$ with the vertex set $R$ in which two terminals
$z_i$ and $z_j$ are adjacent if in $G$ there is a path between $z_i$ and $z_j$ completely in
$N(z_i) \cup N(z_j)$, with the cost of the corresponding edge, $c'((z_i,z_j))$, being the
length of a shortest such path. A minimum spanning tree $T'$ for $G'$ will also be a
minimum spanning tree for $D_G(R)$. The neighborhoods $N(z)$ for all $z \in R$, the
graph $G'$ and the tree $T'$ can be constructed in total time $O(e + n \log n)$ [16].
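The core of DNH described above can be sketched as follows. This is the plain textbook version, written for illustration with the hypothetical names `dijkstra` and `distance_network_heuristic`; it runs one shortest-path computation per terminal and is therefore slower than Mehlhorn's $O(e + n \log n)$ implementation based on the neighborhoods $N(z)$. The graph is assumed to be connected and given as an adjacency dictionary with positive weights, and the final cleanup of DNH (spanning tree of the resulting subgraph and pruning of non-terminal leaves) is omitted.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distances and predecessors from source; adj[u] = [(v, weight), ...]."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def distance_network_heuristic(adj, terminals):
    """Edges of G obtained by expanding a minimum spanning tree of the distance network."""
    info = {z: dijkstra(adj, z) for z in terminals}
    in_tree = {terminals[0]}
    edges = set()
    while len(in_tree) < len(terminals):
        # Prim's step on the distance network: cheapest terminal outside the tree.
        _, a, b = min((info[a][0][b], a, b)
                      for a in in_tree for b in terminals if b not in in_tree)
        # Replace the distance-network edge (a, b) by a shortest a-b path in G.
        parent = info[a][1]
        v = b
        while parent[v] is not None:
            edges.add(frozenset((v, parent[v])))
            v = parent[v]
        in_tree.add(b)
    return edges
```

The returned edge set connects all terminals; its total weight is at most the weight of the minimum spanning tree of the distance network, which is the quantity the $2 - 2/r$ guarantee refers to.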

A cut in $G = (V, A, c)$ (or in $G = (V,E,c)$) is defined as a partition
$C = (\overline{W}, W)$ of $V$ ($\emptyset \subset W \subset V$; $V = W \cup \overline{W}$). We use $\delta^-(W)$ to denote the
set of arcs $[v_i,v_j] \in A$ with $v_i \in \overline{W}$ and $v_j \in W$. The sets $\delta^+(W)$ and, for
the undirected version, $\delta(W)$ are defined similarly. For simplicity, we sometimes
refer to these sets of arcs (or edges) as cuts. A cut $C = (\overline{W}, W)$ is called a
Steiner cut, if $z_1 \in \overline{W}$ and $R_1 \cap W \ne \emptyset$ (for the undirected version: $R \cap W \ne \emptyset$
and $R \cap \overline{W} \ne \emptyset$). The (directed) cut formulation $P_C$ [22] uses the concept of
Steiner cuts to formulate the Steiner problem in a (in this context bidirected)
network $G = (V, A, c)$ as an integer program. In this program, the (binary) vector
$x$ represents the incidence vector of the solution.

$$P_C: \quad \min \sum_{p \in A} c(p)\, x_p \quad \text{subject to:}$$

$$\sum_{p \in \delta^-(W)} x_p \ge 1 \ \text{ for all Steiner cuts } (\overline{W}, W); \qquad x \in \{0, 1\}^a.$$

The undirected cut formulation $P_{UC}$ is defined similarly [1]. For an integer program
like $P_C$, we denote with $LP_C$ the corresponding linear programming relaxation
and with $DLP_C$ the program dual to $LP_C$; and we use $v(Q)$ to denote
the value of an optimal solution of a (linear or integer) program $Q$. Introducing
a dual variable $y_W$ for each Steiner cut $(\overline{W}, W)$, we have:

$$DLP_C: \quad \max \sum_{W} y_W \quad \text{subject to:}$$

$$\sum_{W :\, p \in \delta^-(W)} y_W \le c(p) \ \text{ for all } p \in A; \qquad y \ge 0.$$

The constraints $\sum_{W :\, p \in \delta^-(W)} y_W \le c(p)$ are called the (cut) packing constraints.

2 Undirected Cuts: A Primal-Dual Algorithm 

Some of the best-known primal-dual approximation algorithms are designed for a 
class of constrained forest problems which includes the Steiner problem (see [10]). 
These algorithms are essentially dual-ascent algorithms based on undirected cut 
formulations. For the Steiner problem, such an algorithm guarantees an upper 
bound of 2 — 2/r on the ratio between the values of the provided primal and 
dual solutions. This is the best possible guarantee when using the undirected 
cut relaxation $LP_{UC}$, since it is easy to construct instances (even with $r = n$)
where the ratio $v(P_{UC}) / v(LP_{UC})$ is exactly $2 - 2/r$ (see for example [8]). In the
following, we briefly describe such an algorithm when restricted to the Steiner
problem, show how to make it much faster for this special case, and give some new
insights into it. We denote this algorithm with $PD_{UC}$ ($PD$ stands for Primal-Dual
and $UC$ stands for Undirected Cut).

The algorithm maintains a forest $F$, which initially consists of isolated vertices
of $V$. A connected component $S$ of $F$ is called an active component if $(\bar{S}, S)$
defines a Steiner cut. In each iteration, dual variables corresponding to active
components are increased uniformly until a new packing constraint becomes
tight, i.e. the reduced cost $c(p) - \sum_{S:\, p \in \delta(S)} y_S$ of some edge $p$ becomes
zero, which is then added to $F$ (ties are broken arbitrarily). The algorithm terminates
when no active component is left; at this time, $F$ defines a feasible Steiner tree and
$\sum_{(\bar{S}, S)\ \text{Steiner cut}} y_S$ represents a lower bound on the weight of any
Steiner tree for the observed instance. In a subsequent pruning phase, every edge of $F$
which is not on a path (in $F$) between two terminals is removed. In [10], it is shown how
to make this algorithm (for the generalized problem) run in $O(n^2 \log n)$ time;
see also [6, 13] for some improvements.

When restricted to the Steiner problem and as far as the constructed Steiner
tree is considered, the algorithm $PD_{UC}$ is essentially the DNH (Section 1),
implemented by an interleaved computation of shortest paths trees out of terminals
and a minimum spanning tree for the terminals with respect to their distances.
In fact, every Steiner tree $T$ provided by Mehlhorn’s $O(e + n \log n)$ time
implementation of DNH can be considered as a possible result of $PD_{UC}$. We observed
that even the lower bound calculation can be performed in the same time: Let
$T'$ be a minimum spanning tree for $R$ provided by Mehlhorn’s implementation
of DNH and let $e'_1, \ldots, e'_{r-1}$ be its edges in nondecreasing cost order. The
algorithm $PD_{UC}$ increases all dual variables corresponding to the initially $r$ active
components by $c'(e'_1)/2$, then the components corresponding to the vertices of $e'_1$
are merged. The dual variables of the remaining $r - 1$ components are increased
by $(c'(e'_2) - c'(e'_1))/2$ (which is possibly zero) before the next two components are
merged, and so on. Therefore, the lower bound provided by $PD_{UC}$ is (defining
$c'(e'_0) := 0$) simply

$\sum_{i=1}^{r-1} (r - i + 1) \cdot \frac{c'(e'_i) - c'(e'_{i-1})}{2} = \frac{1}{2}\left(c'(e'_{r-1}) + \sum_{i=1}^{r-1} c'(e'_i)\right) = \frac{1}{2}\left(c'(e'_{r-1}) + c'(T')\right)$,

which can be computed in $O(r)$ time once $T'$ is available.
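
The telescoping argument can be checked numerically. The following is a minimal sketch, assuming only that the edge costs of the spanning tree $T'$ are given as a list; the function names are hypothetical and chosen for illustration. It accumulates the dual increases merge by merge and compares the result with the closed form.

```python
def pd_uc_lower_bound(mst_costs):
    """Sum the dual increases of PD_UC, one merge step at a time."""
    costs = sorted(mst_costs)      # c'(e'_1) <= ... <= c'(e'_{r-1})
    r = len(costs) + 1             # number of terminals
    bound, previous = 0.0, 0.0     # previous plays the role of c'(e'_0) = 0
    for i, cost in enumerate(costs, start=1):
        active = r - i + 1         # components still active before the i-th merge
        bound += active * (cost - previous) / 2
        previous = cost
    return bound


def closed_form(mst_costs):
    """The same value via (c'(e'_{r-1}) + c'(T')) / 2."""
    return (max(mst_costs) + sum(mst_costs)) / 2
```

For the costs 1, 3, 4 (so $r = 4$), both functions return 6: the longest edge is counted twice, every other edge once, and the total is halved.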

From this new view of $PD_{UC}$ we get some insight into the gap between
the provided upper and lower bounds. Assuming that the cost of $T'$ is not
dominated by the cost of its longest edge and that the Steiner tree corresponding
to $T'$ is not much cheaper than $T'$ itself (which is usually the case), the ratio
between the upper and lower bound is nearly two; and this suggests that either
the lower bound, or the upper bound, or both are not really tight.

Empirically, results on different types of instances (from SteinLib [14]) show
an average gap of about 45% (of optimum) between the upper and the
lower bounds calculated by $PD_{UC}$. This is in accordance with the relation we
established above between these two values. This gap is mainly due to the lower
bounds, where the gap to optimum is typically over 30%. So although this heuristic
can be implemented to be very fast empirically (small fractions of a second
on a Pentium II 450 MHz PC, even for fairly large instances with several thousands
of vertices), it is not suitable for computing tight bounds.
3 Directed Cuts: An Old Dual-Ascent Algorithm

In the search for an approach for computing tighter lower and upper bounds, 
the directed cut relaxation is a promising alternative. Although no better upper
bound than the $2 - 2/r$ one from the previous section is known on the integrality
gap of this relaxation, the gap is conjectured to be much closer to 1, and the
worst instance known has an integrality gap of approximately 8/7 [7]. There are
many theoretical and empirical investigations which indicate that the directed 
relaxation is a much stronger relaxation than the undirected one (see for example 
[2, 3]). In [18], we could achieve impressive empirical results (including extremely 
tight lower and upper bounds) using this relaxation. In that work, extensions of 
a dual ascent algorithm of Wong [22] played a major role. Although many works 
on the Steiner problem use variants of this heuristic (see for example [5, 12, 21]), 
none of them includes a discussion of the theoretical quality of the generated 
bounds. In this section, we show that none of these variants can guarantee a 
constant approximation ratio for the generated lower or upper bounds. 

The dual-ascent algorithm in [22] is described using the multicommodity flow
relaxation. Here we give a short alternative description of it as a dual-ascent
algorithm for the (equivalent) relaxation $LP_C$, which we denote with $DA_C$. The
algorithm maintains a set $H$ of arcs with zero reduced costs, which is initially
empty. For each terminal $z_t \in R_1$, define the component of $z_t$ as the set of all
vertices for which there exists a directed path to $z_t$ in $H$. A component is said to
be active if it does not contain the root. In each iteration, an active component
is chosen and the dual variable of the corresponding Steiner cut is increased until
the packing constraint for an arc in this cut becomes tight. Then the reduced
costs of the arcs in the cut are updated and the arcs with reduced cost zero are
added to $H$. The algorithm terminates when no active component is left; at this
time, $H$ (regarded as a subgraph of $G$) is a feasible solution for the observed
instance of the (directed) Steiner problem. To get a (directed) Steiner tree, in
[22] the following method is suggested: Let $Q$ be the set of vertices reachable
from $z_1$ in $H$. Compute a minimum directed spanning tree for the subgraph of
$G$ induced by $Q$ and prune this tree until all its leaves are terminals. In [12], this
method is adapted to the undirected version, mainly by computing a minimum
(undirected) spanning tree instead of a directed one.
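
The dual-ascent scheme just described can be sketched in a few lines. This is an illustrative, unoptimized version and not the implementation of [5] or [22]: it assumes a connected bidirected network with positive integer arc costs given as a dictionary, it picks an active component of minimum size in each iteration, and it recomputes the components from scratch. The function name and the input format are assumptions made for this sketch.

```python
def dual_ascent(arcs, root, terminals):
    """Return (lower_bound, tight_arcs).

    arcs maps (tail, head) to a positive integer cost; both directions
    of every edge are expected to be present.
    """
    reduced = dict(arcs)   # current reduced costs
    tight = set()          # the arc set H
    lower = 0

    def component(z):
        # All vertices with a directed path to z that uses tight arcs only.
        seen, stack = {z}, [z]
        while stack:
            v = stack.pop()
            for tail, head in tight:
                if head == v and tail not in seen:
                    seen.add(tail)
                    stack.append(tail)
        return seen

    while True:
        components = [component(z) for z in terminals if z != root]
        active = [s for s in components if root not in s]
        if not active:
            return lower, tight
        s = min(active, key=len)                    # minimum-size active component
        cut = [a for a in reduced if a[0] not in s and a[1] in s]
        delta = min(reduced[a] for a in cut)        # first arc of the cut to become tight
        lower += delta
        for a in cut:
            reduced[a] -= delta
            if reduced[a] == 0:
                tight.add(a)
```

On a star with center 3, unit costs, and terminals 0 (root), 1 and 2, the sketch raises three dual variables by 1 each and returns the lower bound 3, which equals the cost of the optimal Steiner tree for that instance.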

In [5], an implementation of $DA_C$ with running time $O(a \cdot \min\{a, rn\})$ is
described. Although the algorithm usually runs much faster than this bound would
suggest, we have constructed instances on which every dual-ascent algorithm
following the same scheme must perform $\Theta(n^4)$ operations.

To show that the lower bound generated by $DA_C$ can deviate arbitrarily
from $v(LP_C)$, two difficulties must be considered. The first one is the choice
of the root: although the value $v(LP_C)$ for an instance of the (undirected) Steiner
problem is independent of the choice of the root [9], the lower bound generated
by $DA_C$ is not, so the argument must be independent of this choice.
The second difficulty is the choice of an active component in each iteration. In
the original work of Wong [22], the chosen component is merely required to be
a so-called root component. A component $S$ corresponding to a terminal $z_t$ is
called a root component if for any other terminal $z_s$ in this component, there
is a path from $z_t$ to $z_s$ in $H$. This is equivalent to $S$ being a minimal (with
respect to inclusion) active component. An empirically more successful variant
uses a size criterion: at each iteration, an active component of minimum size is
chosen (see [5, 18]). Note that such a component is always a root component.
So, in this context it is sufficient to study the variant based on the size criterion.

Example 1. In Figure 1, there are -I- c-l- 1 terminals 
(filled circles); the top terminal is considered as the 
root. The edges incident with the left c terminals have 
costs c^, all the other edges have costs c. According 
to the size criterion, each of the terminals (i.e. their 
components) at the left is chosen twice before any of 
the terminals at the bottom can be chosen a second 
time. But then, there is no active component anymore 
and the algorithm terminates. So, the lower bound 
generated by DAq is in 0(c^). On the other hand, it 
is easy to see that for this instance: v{LPc) = v{Pc) S 

Now imagine c copies of this graph sharing the 
top terminal. For the resulting instance, we have 
v{LPc) = v{Pc) S 6>(c®); but the lower bound gen- 
erated by DAc will be in 6>(c^) independent of the 
choice of the root, because the observation above will 
remain valid in at least c — 1 copies. 

Now we turn to upper bounds: By changing the costs of the edges incident to the 
left terminals from to c -I- e (for a small e) in Figure 1 we get an instance for 
which the ratio between the upper bound calculated by the algorithm described 
in this section and v{Pc) can be arbitrarily large. This is also the case for all 
other approaches in the literature for computing upper bounds based on the 
graph H provided by DAc, because v{Pc) € 6>(c^) for such an instance, but 
there is no solution with cost o(c^) in the subgraph H generated by DAq- 

Despite its bad performance in the worst case, the algorithm typically pro- 
vides fairly tight lower bounds, with average gaps ranging from a small fraction 
of a percent to about 2%, depending on the type of instances. The upper bounds 
are not good, with average gaps from 8% to 30%, again depending on the type 
of instances. The running times (using the same test bed as in section 1) are still 
quite tolerable (about a second even for fairly large instances). 

4 Directed Cuts: A New Primal-Dual Algorithm 

The previously described heuristics had complementary advantages: The first,
$PD_{UC}$, guarantees an upper bound of 2 on the ratio between the generated
upper and lower bounds, but empirically, it does not perform much better than
in the worst case. The second one, $DA_C$, cannot provide such a guarantee, but
empirically it performs much better than the first one, especially for computing
lower bounds. In this section we describe a new algorithm which combines both
features.

Fig. 1. Arbitrarily bad case for $DA_C$

A straightforward application of the primal-dual method of $PD_{UC}$ (simultaneous
increasing of all dual variables corresponding to active components and
merging components which share a vertex) to $LP_C$ leads to an algorithm with
performance ratio 2 and running time $O(e + n \log n)$, but the generated lower
bounds are again not nearly as tight as those provided by $DA_C$.

The main idea for a successful new approach is not to merge the components, 
but to let them grow as long as they are (minimally) active. As a consequence, 
dual variables corresponding to several cuts which share the same arc may be 
increased simultaneously. Because of that, the reduced costs of arcs which are in 
the cuts of many active components are decreased much faster than the other 
ones and we have constructed instances where a straightforward primal-dual 
algorithm based on this approach fails to give a performance ratio of two. 

Therefore, we group all components that share a vertex together and postulate
that in each iteration, the total increase $\Delta$ of dual variables corresponding
to each group containing at least one active component must be the same. If we
denote the number of active components in a group $\Gamma$ with $activesInGroup(\Gamma)$,
the dual variable corresponding to each of these components will be increased
by $\Delta / activesInGroup(\Gamma)$. Similar to the case of $DA_C$, a component is called
active if it does not contain the root or include an active component of another
terminal (ties are broken arbitrarily). A terminal is called active if its component
is active; and a group is called active if it contains an active terminal (by this
definition it is guaranteed that each active root component corresponds to one
active terminal). If we denote with $activeGroups$ the number of active groups,
the lower bound $lower$ will be increased in each iteration by $\Delta \cdot activeGroups$.

To manage the reduced costs efficiently, a concept like that of distance
estimates in the algorithm of Dijkstra is used (see for example [4]). For each arc
$x$, the value $d(x)$ estimates the value of $dGroup$ (amount of uniform increase
of group duals, i.e. the sum of $\Delta$-values) which would make $x$ tight (set its
reduced cost $\bar{c}(x)$ to zero). Because of the definition of groups, for an arc $x$
with reduced cost $\bar{c}(x) > 0$, all active components $S$ with $x \in \delta^-(S)$ will be
in the same group $\Gamma$. If there are $activesOnArc(x)$ such components, then $d(x)$
should be $\bar{c}(x) \cdot activesInGroup(\Gamma) / activesOnArc(x) + dGroup$. For updating
the $d$-values we use two further variables for each arc $x$: $reducedCost(x)$ and
$lastReducedCostUpdate(x)$; they are initially set to $c(x)$ and 0, respectively. If
$activesOnArc(x)$ and/or $activesInGroup(\Gamma)$ change, the new value for $d(x)$ can
be calculated by:

$d(x) := reducedCost(x) \cdot activesInGroup(\Gamma) / activesOnArc_{new}(x) + dGroup$;
$reducedCost(x) := reducedCost(x) - (dGroup - lastReducedCostUpdate(x)) \cdot activesOnArc_{old}(x) / activesInGroup(\Gamma)$;
$lastReducedCostUpdate(x) := dGroup$.
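
A small numeric sketch may help to see how these three assignments interact. It rests on two assumptions that are not spelled out above: the stored reduced cost is first brought up to date using the counts that were valid since the last update, and only then is the key recomputed from the new counts. The class and parameter names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ArcState:
    reduced_cost: float        # reducedCost(x), initially c(x)
    last_update: float = 0.0   # lastReducedCostUpdate(x)


def refresh_key(arc, d_group, on_arc_old, in_group_old, on_arc_new, in_group_new):
    """Bring reducedCost(x) up to date, then return the new key d(x)."""
    if on_arc_old > 0:
        # Since the last update the arc lost reduced cost at the rate
        # on_arc_old / in_group_old per unit of dGroup.
        arc.reduced_cost -= (d_group - arc.last_update) * on_arc_old / in_group_old
    arc.last_update = d_group
    if on_arc_new == 0:
        return math.inf        # no active component has this arc in its cut
    return arc.reduced_cost * in_group_new / on_arc_new + d_group
```

For an arc of cost 6 in a group with two active components, one of which has the arc in its cut, the first key is 6 · 2 / 1 + 0 = 12. If the second component reaches the arc when dGroup = 4, the reduced cost has dropped to 6 - 4 · 1/2 = 4, and the key becomes 4 · 2 / 2 + 4 = 8.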

Below we give a description of the algorithm $PD_C$ in pseudocode with macros
(a call to a macro is to be simply replaced by its body). A priority queue PQ
manages the arcs using the $d$-values as keys. The groups are stored in a disjoint-set
data structure Groups. Two lists $H$ and $\bar{H}$ store the tight arcs and the
corresponding edges. A Stack is used to perform depth-first searches from vertices
newly added to a component. The array visited[z, v] indicates whether the
vertex $v$ is in the component of the terminal $z$; firstSeenFrom[v] gives the first
terminal whose component has reached the vertex $v$; active[z] indicates whether
the terminal $z$ is active; d[x] gives the $d$-value of the arc $x$; activesInGroup[Γ]
stores the number of the active components in the group $\Gamma$; and activesOnArc[x]
gives the number of components which have the arc $x$ in their cuts.

PD_C(G, R, z_1)
 1  initialize PQ, Groups, H, H̄;
 2  forall z ∈ R_1 :   "initializing the components"
 3      Groups.MAKE-SET(z); activesInGroup[z] := 1; active[z] := TRUE;
 4      forall x ∈ δ^-(z) :
 5          activesOnArc[x] := 1; d[x] := c(x); PQ.INSERT(x, d[x]);
 6      forall v ∈ V : visited[z, v] := FALSE;
 7      visited[z, z] := TRUE; firstSeenFrom[z] := z;
 8  forall v ∈ V \ R : firstSeenFrom[v] := 0;
 9  activeGroups := r - 1; dGroup := 0; lower := 0;
10  while activeGroups > 0 :
11      x := [v_i, v_j] := PQ.EXTRACT-MIN();   "get the next arc becoming tight"
12      Δ := d[x] - dGroup; dGroup := d[x]; lower := lower + Δ · activeGroups;
13      mark [v_i, v_j] as tight;
14      if (v_i, v_j) is not in H̄ :   "i.e. [v_j, v_i] is not tight"
15          H̄.APPEND((v_i, v_j)); H.APPEND([v_i, v_j]);
16      z_i := firstSeenFrom[v_i]; z_j := firstSeenFrom[v_j];
17      if z_i = 0 : firstSeenFrom[v_i] := z_j;
18      else if Groups.FIND(z_i) ≠ Groups.FIND(z_j) : MERGE-GROUPS(z_i, z_j);
19      forall active z ∈ R_1 :
20          if visited[z, v_j] and not visited[z, v_i] : EXTEND-COMPONENT(z, v_i);
21  H' := H; H̄' := H̄; PRUNE(H', H̄');
22  return H̄', lower;   "upper: the cost of H̄'"

EXTEND-COMPONENT(z, v_i)   "modified depth-first search"
 1  Stack.INIT(); Stack.PUSH(v_i);
 2  while not Stack.EMPTY() :
 3      v := Stack.POP();
 4      if (v = z_1) or (v ∈ R_1 \ {z} and active[v]) :
 5          REMOVE-COMPONENT(z);
 6          break;
 7      if not visited[z, v] :
 8          visited[z, v] := TRUE;
 9          forall [v, w] ∈ δ^+(v) :
10              if visited[z, w] :
11                  activesOnArc[[v, w]] := activesOnArc[[v, w]] - 1;
12                  update the key of [v, w] in PQ;
13              else :
14                  if [w, v] is already tight : Stack.PUSH(w);
15                  else :
16                      activesOnArc[[w, v]] := activesOnArc[[w, v]] + 1;
17                      update the key of [w, v] in PQ;

MERGE-GROUPS(z_i, z_j)
 1  g_i := Groups.FIND(z_i); g_j := Groups.FIND(z_j);
 2  if activesInGroup[g_i] > 0 and activesInGroup[g_j] > 0 :
 3      update in PQ the keys of all arcs entering these groups;
 4      activeGroups := activeGroups - 1;
 5  Groups.UNION(g_i, g_j); g_new := Groups.FIND(g_i);
 6  activesInGroup[g_new] := activesInGroup[g_i] + activesInGroup[g_j];

REMOVE-COMPONENT(z)
 1  active[z] := FALSE; g := Groups.FIND(z);
 2  update in PQ the keys of all arcs entering g or the component of z;
 3  activesInGroup[g] := activesInGroup[g] - 1;
 4  if activesInGroup[g] = 0 : activeGroups := activeGroups - 1;

PRUNE(H', H̄')
 1  forall [v_i, v_j] in H', in reverse order :
 2      if H̄' without (v_i, v_j) connects all terminals :
 3          H̄'.DELETE((v_i, v_j)); H'.DELETE([v_i, v_j]);

In $PD_C$, all initializations in lines 1-9 need $O(rn + a \log n)$ time. The loop
in the lines 10-20 is repeated at most $a$ times, because in each iteration an
arc becomes tight and there will be no active terminal (or group) when all
arcs are tight. Over all iterations, line 11 needs $O(a \log n)$ time and lines 12-20
excluding the macros $O(ar)$ time. Each execution of MERGE-GROUPS needs
$O(a \log n)$ time and there can be at most $r - 1$ such executions; the same is true for
REMOVE-COMPONENT. For each terminal, the adjacency list of each vertex
is considered only once during all executions of EXTEND-COMPONENT, so
each arc is considered (and its key is updated in PQ) at most twice for each
terminal, leading to a total time of $O(ra \log n)$ for all executions of
EXTEND-COMPONENT. So the lines 1-20 can be executed in $O(ra \log n)$ time.

It is easy to prove that the reverse order deletion in PRUNE can be performed
efficiently by the following procedure: Consider a graph with the edge set $\bar{H}$
in which the weight of each edge $e$ is the position of the corresponding edge in
the list $\bar{H}$. The edge set of a minimum spanning tree for this graph, after pruning
it until it has only terminals as leaves, is $\bar{H}'$. Since the edges of $\bar{H}$ are already
available in a sorted list, the minimum spanning tree can be computed in $O(e\, \alpha(e, n))$
time. This leads to a total time of $O(ra \log n)$ for $PD_C$.
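
The procedure can be sketched as follows. This is a minimal illustration, assuming the tight edges are given as a list of vertex pairs in the order in which they became tight; the function name is hypothetical, and a plain union-find with path halving stands in for the disjoint-set structure that yields the stated bound.

```python
from collections import defaultdict


def prune_by_position(edges, terminals):
    """Spanning forest by list position, then removal of non-terminal leaves."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    # Kruskal's algorithm: the list order is already the order by weight.
    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))

    # Repeatedly remove leaves that are not terminals.
    adj = defaultdict(set)
    for u, v in forest:
        adj[u].add(v)
        adj[v].add(u)
    stack = [v for v in adj if len(adj[v]) == 1 and v not in terminals]
    while stack:
        v = stack.pop()
        if len(adj[v]) != 1:
            continue
        (w,) = adj[v]
        adj[v].clear()
        adj[w].discard(v)
        if len(adj[w]) == 1 and w not in terminals:
            stack.append(w)
    return [(u, v) for u, v in forest if v in adj[u]]
```

With the tight edges (a,b), (b,c), (a,c), (c,d) in this order and terminals a and c, reverse order deletion drops (c,d) and (a,c) and keeps the other two; the sketch arrives at the same edge set by skipping (a,c) as a cycle edge and pruning the leaf d.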

Below we show that the ratio between the upper bound $upper$ and the lower
bound $lower$ generated by $PD_C$ is at most 2.

Let $T$ be (the arcs of) the directed tree obtained by rooting $\bar{H}'$ at $z_1$. For
each component $S$, we denote with $activesInGroupOf(S)$ the total number of
active components in the group of $S$.

Lemma 1. At the beginning of each iteration in the algorithm $PD_C$, it holds:

\[ \sum_{S\ \text{active}} \frac{|H' \cap \delta^-(S)|}{activesInGroupOf(S)} \leq \left(2 - \frac{1}{r-1}\right) \cdot activeGroups. \]



Proof. Several invariants are valid at the beginning of each iteration in $PD_C$:

(1) All vertices in a group are connected by the edges currently in $\bar{H}$.

(2) For each active group $\Gamma$, at most one arc of $\delta^-(\Gamma)$ will belong to $T$, since
all but one of the edges in $\delta(\Gamma) \cap \bar{H}$ will be removed by PRUNE because of
(1). So $T$ will still be a tree if for each active group $\Gamma$, all arcs which begin
and end in $\Gamma$ are contracted.

(3) For each group $\Gamma$ and each active component $S \subseteq \Gamma$, no arc $[v_i, v_j] \in \delta^-(S)$
with $v_i, v_j \in \Gamma$ will be in $H'$, since it is not yet in $H$ (otherwise it would not
be in $\delta^-(S)$) and if it is added to $H$ later, it will be removed by PRUNE
because of (1).

(4) For each active group $\Gamma$ and each arc $[v_i, v_j] \in T \cap \delta^+(\Gamma)$, there is at least
one active terminal in the subtree $T_j$ of $T$ with the root $v_j$. Otherwise
$(v_i, v_j)$ would be removed by PRUNE, because all terminals in $T_j$ are already
connected to the root by edges in $\bar{H}$.

(5) Because of (2), (4) and since at least one arc in $T$ leaves $z_1$, it holds:

$\sum_{\Gamma\ \text{active group}} |T \cap \delta^-(\Gamma)| \geq 1 + \sum_{\Gamma\ \text{active group}} |T \cap \delta^+(\Gamma)|$.

(6) Because of (3), for each active group $\Gamma$ holds:

$\sum_{S \subseteq \Gamma,\ S\ \text{active}} |H' \cap \delta^-(S)| \leq activesInGroup(\Gamma) \cdot |H' \cap \delta^-(\Gamma)|$.

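We split $H'$ into $H' \cap T$ and $H' \setminus T$. Because $H'$ and $T$ differ only in the
direction of some arcs, $H' \setminus T$ is just $T \setminus H'$ with reversed arcs. Now we have:

\[
\begin{aligned}
\sum_{S\ \text{active}} \frac{|H' \cap \delta^-(S)|}{activesInGroupOf(S)}
&= \sum_{\Gamma\ \text{active group}}\ \sum_{S\ \text{active},\ S \subseteq \Gamma} \frac{|H' \cap \delta^-(S)|}{activesInGroup(\Gamma)} \\
&\leq \sum_{\Gamma\ \text{active group}} |H' \cap \delta^-(\Gamma)| && \text{because of (6)} \\
&= \sum_{\Gamma\ \text{active group}} \left( |H' \cap T \cap \delta^-(\Gamma)| + |(T \setminus H') \cap \delta^+(\Gamma)| \right) \\
&\leq \left( \sum_{\Gamma\ \text{active group}} \left( |H' \cap T \cap \delta^-(\Gamma)| + |T \cap \delta^-(\Gamma)| \right) \right) - 1 && \text{because of (5)} \\
&\leq 2 \cdot activeGroups - 1 && \text{because of (2)}.
\end{aligned}
\]

Because $activeGroups \leq r - 1$ this proves the lemma.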



Theorem 1. Let $upper$ and $lower$ be the bounds generated by $PD_C$. It holds:
$upper \leq \left(2 - \frac{1}{r-1}\right) \cdot lower$.

Proof. Let $\Delta_i$ be the value of $\Delta$ in the iteration $i$. For each directed Steiner cut
$(\bar{S}, S)$, let $y_S$ be the value of the corresponding dual variable as (implicitly)
calculated by $PD_C$ (in iteration $i$ each dual variable $y_S$ corresponding to an active
component $S$ is increased by $\Delta_i / activesInGroupOf(S)$). Since all arcs of $H'$ have
zero reduced costs, we have:

$upper = \sum_{x \in H'} c(x) = \sum_{x \in H'} \sum_{S:\, x \in \delta^-(S)} y_S = \sum_{S} |H' \cap \delta^-(S)| \cdot y_S$.

This value is zero at the beginning and is increased by
$\sum_{S\ \text{active}} |H' \cap \delta^-(S)| \cdot \Delta_i / activesInGroupOf(S)$ in the iteration $i$. By Lemma 1,
this increase is at most $\left(2 - \frac{1}{r-1}\right) \cdot activeGroups \cdot \Delta_i$. Since $lower$ is zero at
the beginning and is increased exactly by $activeGroups \cdot \Delta_i$ in the iteration $i$,
we have $upper \leq \left(2 - \frac{1}{r-1}\right) \cdot lower$ after the last iteration. □

We found examples which show that the given approximation ratio is tight for 
the upper bound as well as for the lower bound. 

The discussion above assumes exact real arithmetic. Even if we adopt the
(usual) assumption that all numbers in the input are integers, using exact
arithmetic could deteriorate the worst case running time due to the growing
denominators. But if we allow a deterioration of $\epsilon$ (for a small constant $\epsilon$) in the
approximation ratio, we can solve this problem by an appropriate fixed-point
representation of all numbers.

Empirically, this algorithm behaves similarly to $DA_C$. The lower bounds are
again fairly tight, with average gaps from a fraction of a percent to about 2%,
depending on the type of instances. The upper bounds, although more stable
than those of $DA_C$, are not good; the average gaps are about 8%. The running
times (using the same test bed as in section 1) are, depending on the type of
instances, sometimes better and sometimes worse than those of $DA_C$; altogether
they are still tolerable (several seconds for large and dense graphs).

5 Concluding Remarks 

In this article, we have studied some LP-duality based algorithms for comput- 
ing lower and upper bounds for the Steiner problem in networks. Among other 
things, we have shown that none of the known algorithms both generates tight
lower bounds empirically and guarantees their quality theoretically; and we have
presented a new algorithm which combines both features.

One major point remains to be improved: the approximation ratio of 2.
Assuming that the integrality gap of the directed cut relaxation is well below
2, an obvious desire is to develop algorithms based on it with a better worst
case ratio between the upper and lower bounds (thus proving the assumption).
There are two major approaches for devising approximation algorithms based 
on linear programming relaxations: LP-rounding and primal-dual schema. A 
discussion in [20] indicates that no better guarantee can be obtained using a 
standard LP-rounding approach based on this relaxation. The discussion in this 
paper indicates the same for a standard primal-dual approach. Thus, to get a 
better ratio, extensions of the primal-dual schema will be needed. Two such 
extensions are used in [20], where a ratio of 3/2 is proven for the special class of 
quasi-bipartite graphs. 
Abstract. We investigate the problem of broadcasting information in a
given undirected network. At the beginning information is given at some
processors, called sources. Within each time unit step every informed
processor can inform only one neighboring processor. The broadcasting
problem is to determine the length of the shortest broadcasting schedule
for a network, called the broadcasting time of the network.

We show that there is no efficient approximation algorithm for the
broadcasting time of a network with a single source unless $P = NP$. More
formally, it is NP-hard to distinguish between graphs $G = (V, E)$ with
broadcasting time smaller than $b \in \Theta(\sqrt{|V|})$ and larger than
$(\frac{57}{56} - \epsilon)\, b$ for any $\epsilon > 0$.

For ternary graphs it is NP-hard to decide whether the broadcasting
time is $b \in \Theta(\log |V|)$ or $b + \Theta(\sqrt{b})$ in the case of multiple sources.
For ternary networks with single sources, it is NP-hard to distinguish
between graphs with broadcasting time smaller than $b \in \Theta(\sqrt{|V|})$ and
larger than $b + c \sqrt{\log b}$.

We prove these statements by polynomial time reductions from E3-SAT.
Classification: Computational complexity, inapproximability, network
communication.



1 Introduction 

Broadcasting reflects the sequential and parallel aspects of disseminating infor- 
mation in a network. At the beginning the information is available only at some 
sources. The goal is to inform all nodes of the given network. Every node may 
inform another neighboring node after a certain switching time. Along the edges 
there may be a delay, too. Throughout this abstract the switching time is one 
time unit and edges do not delay information. This model is called Telephone 
model and represents the broadcasting model in its original setting [GaJo79]. 

The restriction of the broadcasting problem to only one information source $v_0$
has often been considered, here called single source broadcasting problem (SB).
Note that the broadcasting time $b(G, v_0)$ is at least $\log_2 |V|$ for a graph
$G = (V, E)$, since during each round the number of informed vertices can at most
double. The smallest graph providing this lower bound is a binomial tree $F_n$
[HHL88]: $F_0$ consists of a single node and $F_{n+1}$ consists of disjoint subtrees
$F_0, \ldots, F_n$ whose roots $r_0, \ldots, r_n$ are connected to the new root $r_{n+1}$. Also the
hyper-cube $C_n = (\{0, 1\}^n, \{\{w0v, w1v\} \mid w, v \in \{0, 1\}^*\})$ has this minimum
broadcasting time since binomial trees can be derived by deleting edges.

The upper bound on $b(G)$ is $|V| - 1$, which is needed for the chain graph
representing maximum sequential delay (Fig. 1) and the star graph (Fig. 2)
producing maximum parallel delay. The topology of the processor network highly
influences the broadcasting time and much effort was given to the question how
to design networks optimized for broadcasting, see [LP88, BHLP92, HHL88].
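
Both extremes can be reproduced with a short computation on rooted trees. The sketch below is illustrative only: a tree is represented as the nested list of the child subtrees of its root, the root is taken as the single source, and the schedule serves the child with the longest remaining work first, which is the usual rule for broadcasting from the root of a tree. All function names are assumptions made for this example.

```python
def binomial_tree(n):
    """Return F_n as the list of child subtrees of its root (F_0 is [])."""
    tree = []
    for _ in range(n):
        # The root of F_k has the children F_0, ..., F_{k-1};
        # one more child F_k turns it into the root of F_{k+1}.
        tree = tree + [tree]
    return tree


def chain(n):
    """A chain with n nodes, rooted at one of its ends."""
    tree = []
    for _ in range(n - 1):
        tree = [tree]
    return tree


def star(n):
    """A star with n nodes, rooted at its center."""
    return [[] for _ in range(n - 1)]


def size(tree):
    return 1 + sum(size(child) for child in tree)


def broadcast_time(tree):
    """Rounds needed when the root informs its children, longest subtree first."""
    times = sorted((broadcast_time(child) for child in tree), reverse=True)
    return max((i + 1 + t for i, t in enumerate(times)), default=0)
```

For $F_n$ the sketch reports $2^n$ nodes and a broadcasting time of $n$, matching the lower bound $\log_2 |V|$, while the chain and the star with six nodes both need five rounds, matching $|V| - 1$.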

Throughout this paper the communication network and the information 
sources are given and the task is to find an efficient broadcasting schedule. The 
original problem deals with single sources and its decision problem, called SBD,
to decide whether the broadcasting time is less than or equal to a given deadline $T_0$,
is NP-complete [GaJo79, SCH81]. Slater et al. also show, for the special case
of trees, that a divide-and-conquer strategy leads to a linear time algorithm.
This result can be generalized for graphs with a small tree-width according to a
tree decomposition of the edges [JRS98]. However, SBD remains NP-complete
even for the restricted case of ternary planar graphs or ternary graphs with
logarithmic depth [JRS98].

Bar-Noy et al. [BGNS98] present a polynomial-time approximation algorithm
for the single source broadcasting problem (SB) with an approximation factor of
$O(\log |V|)$ for a graph $G = (V, E)$. SB is approximable within
$O\left(\frac{\log |V|}{\log \log |V|}\right)$ if the graph has bounded tree-width with respect to the
standard tree decomposition [MRSR95].

Adding more information sources leads to the multiple source broadcasting 
problem (MB). It is known to be NP-complete even for constant broadcasting
time, like 3 [JRS98] or 2 [Midd93]. This paper solves the open problem whether
there are graphs that have a non-constant gap between the broadcasting time
$b(G)$ and a polynomial time computable upper bound. In [BGNS98] this question
was solved for the more general multicast model proving an inapproximability
factor bound of $3 - \epsilon$ for any $\epsilon > 0$. In this model switching time and edge
delay may differ for each node and instead of the whole network a specified
sub-network has to be informed.

It was an open problem whether this lower bound could be transferred to the
Telephone model. In this paper, we solve this problem using a polynomial time
reduction from E3-SAT to SB. The essential idea makes use of the high degree
of the reduction graph’s source. A good broadcasting strategy has to make most
of its choices there and we show that this is equivalent to assigning variables of
an E3-CNF-formula. A careful book-keeping of the broadcasting times of certain
nodes representing literals and clauses gives the lower bound of $\frac{57}{56} - \epsilon$.

We show for ternary graphs and multiple sources that graphs with a
broadcasting time $b \in \Theta(\log |V|)$ cannot be distinguished from those with
broadcasting time $b + c\sqrt{b}$ for some constant $c$. This result implies that it is NP-hard to
distinguish between ternary graphs with the single source broadcasting time of
$b \in \Theta(\sqrt{|V|})$ and graphs with broadcasting time $b + c\sqrt{\log b}$.

The paper is organized as follows. In Section 2 formal notations are intro- 
duced, in the next section the general lower bound of SB is proved. We present in 
section 4 lower bounds for the ternary case. Section 5 concludes and summarizes 
these results. 

2 Notation 

Edges of the given undirected graph may be directed to indicate the information 
flow along an edge. 

Definition 1. Let $G = (V, E)$ be an undirected graph with a set of vertices
$V_0 \subseteq V$, called the sources. The task is to compute the broadcasting time
$b(G, V_0)$, the minimum length $T$ of a broadcast schedule $S$. This is a sequence
of sets of directed edges $S = (E_1, E_2, \ldots, E_{T-1}, E_T)$. Their nodes are in the sets
$V_0, V_1, \ldots, V_T = V$, where for $i > 0$ we define
$V_i := V_{i-1} \cup \{v \mid (u, v) \in E_i \text{ and } u \in V_{i-1}\}$. A broadcast schedule $S$ fulfills the properties

1. $E_i \subseteq \{(u, v) \mid u \in V_{i-1}, \{u, v\} \in E\}$ and

2. $\forall u \in V_{i-1} : |E_i \cap (\{u\} \times V)| \leq 1$.

The set of nodes $V_i$ has received the broadcast information by round $i$. For
an optimal schedule with length $T$, the set $V_T$ is the first to include all nodes of
the network. $E_i$ is the set of edges used for sending information at round $i$. Each
processor $u \in V_{i-1}$ can use at most one of its outgoing edges in every round.
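
The two properties translate directly into a check that replays a schedule round by round. The following is a small sketch under stated assumptions: the graph is given as a list of undirected edges, a schedule as a list of rounds, and each round as a list of directed pairs; the function names are hypothetical.

```python
def informed_sets(edges, sources, schedule):
    """Return [V_0, ..., V_T], or raise ValueError if property 1 or 2 fails."""
    undirected = {frozenset(e) for e in edges}
    informed = set(sources)
    history = [set(informed)]
    for round_edges in schedule:
        senders = set()
        for u, v in round_edges:
            # Property 1: the sender is already informed and {u, v} is an edge.
            if u not in informed or frozenset((u, v)) not in undirected:
                raise ValueError(f"({u}, {v}) violates property 1")
            # Property 2: every node sends at most once per round.
            if u in senders:
                raise ValueError(f"{u} sends twice in one round (property 2)")
            senders.add(u)
        informed |= {v for _, v in round_edges}
        history.append(set(informed))
    return history


def is_broadcast(edges, sources, schedule):
    """True if the schedule is valid and informs every node of the graph."""
    nodes = {v for e in edges for v in e} | set(sources)
    return informed_sets(edges, sources, schedule)[-1] == nodes
```

On the chain 1-2-3-4 with source 1, the schedule that forwards the information one hop per round is accepted and needs three rounds; a round in which node 2 sends to both of its neighbors is rejected because of property 2.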

Definition 2. Let $S$ be a broadcast schedule for $(G, V_0)$, where $G = (V, E)$. The
broadcasting time of a node $v \in V$ is defined as $b_S(v) = \min\{i \mid v \in V_i\}$.
A broadcast schedule $S$ is called busy if the following holds.

1. y{v,w} G E : bs{w) > bs{v) + I 3w' G V : (z;, w') G 

VuGE\{uo} : n(ExM)| = l. 

In a busy broadcasting schedule, every processor tries to inform a neighbor 
in every step starting from the moment it is informed. When this fails it stops. 
By this time, all its neighbors are informed. Furthermore, every node is informed 
only once. Every schedule can be transformed into a busy schedule within poly- 
nomial time without increasing the broadcasting time of any node. From now on, 
every schedule is considered to be busy. In [BGNS98] this argument is generalized 
(the authors call busy schedules not lazy). 

A chain is defined by $C_n = (\{v_1, \ldots, v_n\}, \{\{v_i, v_{i+1}\}\})$ (Fig. 1), and a star
by $S_n = (\{v_1, \ldots, v_n\}, \{\{v_1, v_i\} \mid i > 1\})$ (Fig. 2).

Fact 1. There is only one busy broadcast strategy that informs a chain with $k$
interior nodes. Let its ends $v, w$ be informed in time $b_w - k \leq b_v \leq b_w$. Then
the chain is informed in time $\lceil (b_v + b_w + k)/2 \rceil$ assuming that the ends have no
obligations for informing other nodes.

There are $n!$ busy broadcast schedules for the star $S_n$ that describe all
permutations of $\{1, \ldots, n\}$ by $(b_S(v_1), \ldots, b_S(v_n))$.
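
The chain formula of Fact 1 can be verified by a direct simulation. The sketch below assumes that the two ends start forwarding the information inward in the round after they are informed, so the $j$-th interior node counted from $v$ is reached at time $b_v + j$ from one side and at time $b_w + (k + 1 - j)$ from the other; the function names are hypothetical.

```python
def chain_finish_time(b_v, b_w, k):
    """Round in which the last node of a chain with k interior nodes is informed."""
    finish = max(b_v, b_w)
    for j in range(1, k + 1):
        from_v = b_v + j               # j-th interior node, reached from the end v
        from_w = b_w + (k + 1 - j)     # the same node, reached from the end w
        finish = max(finish, min(from_v, from_w))
    return finish


def fact_1(b_v, b_w, k):
    """The closed form ceil((b_v + b_w + k) / 2) in integer arithmetic."""
    return -((b_v + b_w + k) // -2)
```

For example, with $b_v = 0$, $b_w = 1$ and $k = 2$ the interior nodes are informed at times 1 and 2, and the closed form gives $\lceil 3/2 \rceil = 2$ as well.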
3 The General Lower Bound

This section presents a polynomial time reduction from E3-SAT to SB and the 
proof of the constant inapproximability factor. E3-SAT denotes the satisfiability 
problem of Boolean CNF-formulas with exactly three literals in each clause. 

Theorem 1 [Håst97]. For any $\varepsilon > 0$ it is NP-hard to distinguish satisfiable
E3-SAT formulas from E3-SAT formulas for which only a fraction $7/8 + \varepsilon$ of the
clauses can be satisfied, unless P = NP.

Let $F$ be a 3-CNF with $m$ clauses $c_1, \dots, c_m$ and variables $x_1, \dots, x_n$. Let
$a(i)$ denote the number of occurrences of the positive literal $x_i$ in $F$. It is possible
to assume that every variable occurs as often positive as negated in $F$, since in
the proof of Theorem 1 this property is fulfilled. Let $\delta := 2\ell m'$, where
$m' := \sum_{i=1}^{n} a(i)$, with $\ell$ being a large number to be chosen later on. Note that $m = \frac{2}{3} m'$.
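
The relation between $m$, $m'$ and $\delta$ follows from counting literal occurrences. A short derivation, assuming (as stated above) that every variable occurs as often positive as negated and that every clause has exactly three literals:

```latex
\begin{align*}
3m &= \underbrace{\sum_{i=1}^{n} a(i)}_{\text{positive occurrences}}
    + \underbrace{\sum_{i=1}^{n} a(i)}_{\text{negated occurrences}}
    = 2m'
   &&\Longrightarrow\quad m = \tfrac{2}{3}\, m', \\
\delta &= 2\ell m' = 2\ell \cdot \tfrac{3}{2}\, m = 3m\ell .
\end{align*}
```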

The formula $F$ is reduced to an undirected graph $G_{F,\ell}$ (see Fig. 5). The
source $v_0$ and its $\delta$ neighbors $x^b_{i,j,k}$ form a star $S_\delta$ ($b \in \{0,1\}$, $i \in \{1, \dots, n\}$,
$j \in \{1, \dots, a(i)\}$, $k \in \{1, \dots, \ell\}$). We call the nodes $x^b_{i,j,k}$ literal nodes. They
belong to $\ell$ disjoint isomorphic subgraphs $G_1, \dots, G_\ell$. A subgraph $G_k$ contains
literal nodes $x^b_{i,j,k}$ representing the literal $x^b_i$ ($x^1_i = x_i$, $x^0_i = \bar{x}_i$).

As a basic tool for the construction of a subgraph $G_k$, a chain $C_p(v,w)$ is
used, starting at node $v$ and ending at $w$, with $p$ interior nodes that are not
incident to any other edge of the graph. Between the literal nodes corresponding
to a variable $x_i$ in $G_k$ we insert chains $C_\delta(x^0_{i,j,k}, x^1_{i,j',k})$ for all $i \in \{1, \dots, n\}$
and $j, j' \in \{1, \dots, a(i)\}$.

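For every clause $c_\nu = x^{\alpha_1}_{i_1} \vee x^{\alpha_2}_{i_2} \vee x^{\alpha_3}_{i_3}$ we insert clause nodes $c_{\nu,k}$, which we
connect via three chains $C_{\delta/2}(c_{\nu,k}, x^{\alpha_\rho}_{i_\rho,j_\rho,k})$ for $\rho \in \{1,2,3\}$ of length $\delta/2$ to
their corresponding literal nodes $x^{\alpha_1}_{i_1,j_1,k}$, $x^{\alpha_2}_{i_2,j_2,k}$, $x^{\alpha_3}_{i_3,j_3,k}$. This way every literal
node is connected to one clause node. This completes the construction of $G_k$.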
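
The construction can be made concrete with a small sketch. It assumes clauses are given as triples of nonzero integers (a DIMACS-style encoding chosen here for illustration), that chains of length $\delta$ connect each negative literal node to each positive literal node of the same variable, and that the $j$-th occurrence of a literal is attached to the clause in which it occurs; all identifiers are hypothetical.

```python
from collections import defaultdict


def build_reduction_graph(clauses, ell):
    """Return (adjacency dict, delta) for the reduction graph sketched above.

    clauses: list of 3-tuples of nonzero ints, e.g. (1, -2, 3).
    ell:     number of isomorphic subgraphs G_1, ..., G_ell.
    """
    pos, neg = defaultdict(int), defaultdict(int)
    for clause in clauses:
        if len(clause) != 3:
            raise ValueError("every clause needs exactly three literals")
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    variables = sorted(set(pos) | set(neg))
    if any(pos[i] != neg[i] for i in variables):
        raise ValueError("each variable must occur equally often positive and negated")

    a = {i: pos[i] for i in variables}        # a(i): positive occurrences
    delta = 2 * ell * sum(a.values())         # delta = 2 * ell * m'
    adj = defaultdict(set)
    fresh = [0]                               # counter for chain-interior nodes

    def add_edge(u, w):
        adj[u].add(w)
        adj[w].add(u)

    def add_chain(u, w, p):
        """Connect u and w by a path with p fresh interior nodes."""
        prev = u
        for _ in range(p):
            fresh[0] += 1
            node = ("chain", fresh[0])
            add_edge(prev, node)
            prev = node
        add_edge(prev, w)

    for k in range(1, ell + 1):
        for i in variables:
            for b in (0, 1):
                for j in range(1, a[i] + 1):
                    add_edge("v0", ("x", b, i, j, k))      # star at the source
            for j in range(1, a[i] + 1):
                for j2 in range(1, a[i] + 1):
                    add_chain(("x", 0, i, j, k), ("x", 1, i, j2, k), delta)
        seen = defaultdict(int)                # occurrence index per literal
        for nu, clause in enumerate(clauses, 1):
            for lit in clause:
                b, i = (1 if lit > 0 else 0), abs(lit)
                seen[(b, i)] += 1
                add_chain(("c", nu, k), ("x", b, i, seen[(b, i)], k), delta // 2)
    return adj, delta
```

For the toy formula $(x_1 \vee x_2 \vee x_3) \wedge (\bar{x}_1 \vee \bar{x}_2 \vee \bar{x}_3)$ with $\ell = 1$, the source has degree $\delta = 6$ and each literal node has degree $a(i) + 2 = 3$: one edge to the source, $a(i)$ chains to opposite literal nodes, and one chain to its clause node.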

The main idea of the construction is that the assignment of a variable $x_i$
indicates when the corresponding literal nodes have to be informed.

Lemma 1. If $F$ is satisfiable, then $b(G_{F,\ell}, v_0) \le \delta + 2m' + 2$.

Proof: The busy schedule $S$ informs all literal nodes directly by $v_0$. Let $\alpha_1, \dots, \alpha_n$
be a satisfying assignment of $F$. The literal nodes $x^{\alpha_i}_{i,j,k}$ of graph $G_k$ are informed
within the time period $(k-1)m'+1, \dots, km'$. The literal nodes $x^{\bar\alpha_i}_{i,j,k}$ are informed
within the time period $\delta - km' + 1, \dots, \delta - (k-1)m'$.

Note that $m'$ is a trivial upper bound for the degree at a literal node. So, the
chains between two literal nodes can be informed in time $\delta + 2m' + 1$. A clause
node can be informed in time $km' + \delta/2 + 1$ by an assigned literal node of the
first type, which always exists since $\alpha_1, \dots, \alpha_n$ satisfies $F$. Note that all literal
nodes corresponding to the second type are informed within $\delta - (k-1)m'$. So
the chains between those and the clause node are informed in time $\delta + 2m' + 2$. □



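Lemma 2. Let $S$ be a busy broadcasting schedule for $G_{F,\ell}$. Then,

1. every literal node will be informed directly from the source $v_0$, and

2. for $c_{\nu,k} = x^{\alpha_1}_{i_1,j_1,k} \vee x^{\alpha_2}_{i_2,j_2,k} \vee x^{\alpha_3}_{i_3,j_3,k}$:
   $b_S(c_{\nu,k}) > \frac{\delta}{2} + \min_\rho \{ b_S(x^{\alpha_\rho}_{i_\rho,j_\rho,k}) \}$.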



Proof:

1. Every path between two literal nodes that avoids $v_0$ has at least length $\delta + 1$.
By Fact 1 even the first informed literal node has no way to inform any other
literal node before time point $\delta$, which is the last time a literal node is going
to be informed by $v_0$.

2. follows by 1. □

If only one clause per Boolean formula is not satisfied, this lemma implies
that if $F$ is not satisfiable, then $b(G_{F,\ell}, \{v_0\}) > \delta + \ell$. A better bound can be
achieved if the inapproximability result of Theorem 1 is applied. A busy schedule
$S$ for graph $G_{F,\ell}$ defines an assignment for $F$. Then, we categorize every literal
as high, low or neutral, depending on the consistency of the time of information.
Clause nodes are classified either as high or neutral. Every unsatisfied clause of
the E3-SAT formula $F$ will increase the number of high literals. Besides this,
high and low literal nodes come in pairs, yet possibly in different subgraphs $G_k$
and $G_{k'}$. The overall number of the high nodes will be larger than that of the
low nodes.



Theorem 2. For every $\varepsilon > 0$ there exist graphs $G = (V,E)$ with broadcasting
time at most $b \in \Theta(\sqrt{|V|})$ such that it is NP-hard to distinguish those from
graphs with broadcasting time at least $(\frac{57}{56} - \varepsilon) b$.

Proof: Consider an unsatisfiable E3-SAT formula $F$, the above described graph
$G_{F,\ell}$ and a busy broadcasting schedule $S$ on it. The schedule defines for each
subgraph $G_k$ an assignment $x_{1,k}, \dots, x_{n,k} \in \{0,1\}^n$ as follows. Assign the
variable $x_{i,k} = \alpha$ if the number of delayed literal nodes with $b_S(x^{\alpha}_{i,j,k}) > \delta/2$ is
smaller than those with $b_S(x^{\bar\alpha}_{i,j,k}) > \delta/2$. If both numbers are equal, w.l.o.g. let
$x_{i,k} = 0$.
[Figure legend: ● delayed literal node, ○ early literal node, □ clause node, + high node, − low node]

Fig. 5. The reduction graph $G_{F,\ell}$.

Fig. 6. High and low literal nodes.



1. A literal node $x^{\alpha}_{i,j,k}$ is coherently assigned iff $b_S(x^{\alpha}_{i,j,k}) \le \delta/2 \Leftrightarrow x_{i,k} = \alpha$.
All coherently assigned literal nodes are neutral.

2. A literal node $x^{\alpha}_{i,j,k}$ is high if it is not coherently assigned and delayed, i.e.
$x_{i,k} = \alpha$ and $b_S(x^{\alpha}_{i,j,k}) > \delta/2$.

3. A literal node $x^{\alpha}_{i,j,k}$ is low if it is not coherently assigned and not delayed, i.e.
$x_{i,k} = \bar\alpha$ and $b_S(x^{\alpha}_{i,j,k}) \le \delta/2$.

4. A clause node $c_{\nu,k}$ is high if all its three connected literal nodes are coherent
and delayed, i.e. $\forall \rho \in \{1,2,3\}: b_S(x^{\alpha_\rho}_{i_\rho,j_\rho,k}) > \delta/2$.

5. All other clause nodes are neutral.

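Every high literal node with broadcasting time $\delta/2 + \varepsilon_1$ for $\varepsilon_1 > 0$ can be matched
to a neutral delayed literal node $x^{\bar\alpha}_{i,j',k}$ with broadcasting time $b_S(x^{\bar\alpha}_{i,j',k}) =
\delta/2 + \varepsilon_2$ for $\varepsilon_2 > 0$. Fact 1 shows that the chain between both of them can be
informed in time $\delta + (\varepsilon_1 + \varepsilon_2)/2$ at the earliest.

For a high clause node with literal nodes $x^{\alpha_\rho}_{i_\rho,j_\rho,k}$ and broadcasting times
$b_S(x^{\alpha_\rho}_{i_\rho,j_\rho,k}) = \delta/2 + \varepsilon_\rho$ with $\varepsilon_1, \varepsilon_2, \varepsilon_3 > 0$, Lemma 2 shows that this high
clause node gets the information not earlier than $\delta + \min\{\varepsilon_1, \varepsilon_2, \varepsilon_3\}$. So, the
chain to the most delayed literal node will be informed at
$\delta + (\min\{\varepsilon_1, \varepsilon_2, \varepsilon_3\} + \max\{\varepsilon_1, \varepsilon_2, \varepsilon_3\})/2$ at the earliest.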

Lemma 3. Let $q$ be the number of low literal nodes, $p$ the number of high literal
nodes, and $p'$ the number of high clause nodes. Then the following holds:

1. $p = q$,

2. $b_S(G_{F,\ell}, v_0) \ge \delta + p$,

3. $b_S(G_{F,\ell}, v_0) \ge \delta + (p + 3p')/2$.
Proof:

1. Consider the set of nodes $x^{\alpha}_{i,j,k}$ for $j \in \{1, \dots, a(i)\}$ and $\alpha \in \{0,1\}$. For this
set let $p_{i,k}$ be the number of high nodes, $q_{i,k}$ the number of low nodes and
$r_{i,k}$ the number of nodes with time greater than $\delta/2$. By the definition of
high and low nodes the following holds for all $i \in \{1, \dots, n\}$, $k \in \{1, \dots, \ell\}$:

$$r_{i,k} - p_{i,k} + q_{i,k} = a(i) .$$

Fact 1 and Lemma 2 show that half of the literal nodes are informed within
$\delta/2$ and the rest later on:

$$\sum_{i,k} r_{i,k} = \delta/2 = \sum_{i,k} a(i) .$$

It then follows that:

$$q - p = \sum_{i,k} \left( r_{i,k} - p_{i,k} + q_{i,k} - a(i) \right) = 0 .$$

2. Note that we can match each of the $p$ high (delayed) literal nodes $x^{\alpha}_{i,j,k}$ to a
coherent delayed literal node $x^{\bar\alpha}_{i,j',k}$. Furthermore, these nodes have to inform
a chain of length $\delta$. If the latest of the high nodes and its partners is informed
at time $\delta/2 + \varepsilon$, then Fact 1 shows that the chain cannot be informed earlier
than $\delta + \varepsilon/2$.

The broadcasting time of all literal nodes is different. Therefore it holds
$\varepsilon \ge 2p$, proving $b_S(G_{F,\ell}, v_0) \ge \delta + p$.

3. Every high clause node is connected to three neutral delayed literal nodes.
The task to inform all chains to the three literal nodes is done at time $\delta + \varepsilon'/2$
at the earliest, if $\delta/2 + \varepsilon'$ is the broadcasting time of the latest literal node.
For $p'$ high clause nodes, there are $3p'$ corresponding delayed neutral literal
nodes. Furthermore, there are $p$ delayed high literal nodes (whose matched
partners may intersect with the $3p'$ neutral literal nodes). Nevertheless, the
latest high literal node with broadcasting time $\delta/2 + \varepsilon''$ causes a broadcast
time on the chain to a neutral delayed literal node of at least $\delta + \varepsilon''/2$.
From both groups consider the most delayed literal node $v_{\max}$. Since every
literal node has a different broadcasting time it holds that $\varepsilon' \ge 3p' + p$, and
thus $b_S(v_{\max}) \ge \delta + (3p' + p)/2$. □

Suppose all clauses are satisfiable. Then Lemma 1 gives an upper bound for
the optimal broadcasting time of $b(G_{F,\ell}, v_0) \le \delta + 2m' + 2$.

Let us assume that at least $\kappa m$ of the $m$ clauses are unsatisfied for every
assignment. Consider a clause node that represents an unsatisfied clause with
respect to the assignment which is induced by the broadcast schedule. Then at
least one of the following cases can be observed:

— The clause node is high, i.e. its three literal nodes are coherently assigned. 

— The clause node is neutral and one of its three literal nodes is low. 

— The clause node is neutral and one of its three literal nodes is high. 
Since each literal node is chained to one clause node only, this implies

$$\kappa \ell m \le p' + p + q = p' + 2p .$$

The case $p \ge 3p'$ implies $p \ge \frac{3}{7}(2p + p')$. Then it holds for the broadcasting time
of any busy schedule $S$:

$$b_S(G_{F,\ell}, v_0) \ge \delta + p \ge \delta + \tfrac{3}{7}(p' + 2p) .$$

Otherwise, if $p < 3p'$, then $\frac{1}{2}(p + 3p') \ge \frac{3}{7}(2p + p')$ and

$$b_S(G_{F,\ell}, v_0) \ge \delta + \tfrac{1}{2}(p + 3p') \ge \delta + \tfrac{3}{7}(p' + 2p) .$$

Note that $\delta = 3m\ell$. Combining both cases, it follows that

$$b_S(G_{F,\ell}, v_0) \ge \delta + \tfrac{3}{7} \kappa \ell m = \delta \left( 1 + \frac{\kappa}{7} \right) .$$
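
The case distinction can be checked mechanically. The sketch below assumes the two lower bounds of Lemma 3 in the form $\delta + p$ and $\delta + (p + 3p')/2$ and the combining constant $3/7$, which is the value at which both cases meet at $p = 3p'$; the function names are illustrative.

```python
from fractions import Fraction


def excess_lower_bound(p: int, p_prime: int) -> Fraction:
    """Best lower bound on b_S - delta given by the two parts of Lemma 3."""
    return max(Fraction(p), Fraction(p + 3 * p_prime, 2))


def combined_bound(p: int, p_prime: int) -> Fraction:
    """The single bound (3/7) * (p' + 2p) used to merge both cases."""
    return Fraction(3, 7) * (2 * p + p_prime)


def ratio_lower_bound(kappa: Fraction) -> Fraction:
    """1 + kappa / 7, the factor obtained from delta = 3 * m * ell."""
    return 1 + Fraction(kappa) / 7
```

With $\kappa = 1/8$, the limiting value permitted by Theorem 1, `ratio_lower_bound` evaluates to $57/56$.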

For any $\varepsilon > 0$ this gives, choosing $\ell \in \Theta(m)$ for sufficiently large $m$,

$$\frac{b_S(G_{F,\ell}, v_0)}{b(G_{F,\ell}, v_0)} \ge \frac{1 + \frac{\kappa}{7}}{1 + \varepsilon} .$$

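Theorem 1 states $\kappa = \frac{1}{8} - \varepsilon''$ for any $\varepsilon'' > 0$, which implies the claimed lower bound
of $\frac{57}{56} - \varepsilon$ for any $\varepsilon > 0$. Note that the number of nodes of $G_{F,\ell}$ is in $O(m^4)$ and
$\delta \in \Theta(m^2)$. □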



4 Inapproximability Results for Ternary Graphs 

The previous reduction used graphs $G_{F,\ell}$ with a large degree at the source node.
To address ternary graphs with multiple sources we modify this reduction as
follows.

The proof uses a reduction from the E3-SAT-6 problem: a CNF formula with
$n$ variables and $m = 2n$ clauses is given. Every clause contains exactly three
literals and every variable appears three times positive and three times negative,
but does not appear in a clause more than once. The output is the maximum
number of clauses that can be satisfied simultaneously by some assignment to
the variables.

Lemma 4. For some $\varepsilon > 0$, it is NP-hard to distinguish between satisfiable
3CNF-6 formulas, and 3CNF-6 formulas in which at most a $(1 - \varepsilon)$-fraction of
the clauses can be satisfied simultaneously.

Proof: Similar to Proposition 2.1.2 in [Feig98]. Here, every second occurrence of
a variable is replaced with a fresh variable when reducing from E3-SAT. This
way the number of positive and negative literals remains equally high. □

How can the star at the source be replaced by a ternary subgraph that
produces high differences between the broadcasting times of the literal nodes?
It turns out that a good way to generate such differences in a very symmetric
setting is a complete binary tree. Using trees instead of a star complicates the
situation. A busy broadcasting schedule informs $\binom{d}{t}$ leaves of a tree with depth $d$ in time $d + t$, where
in the star graph only one was informed in time $t$. This is the reason for the
dramatic decrease of the inapproximability bound.
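
The binomial distribution of leaf informing times can be reproduced by a few lines of simulation. This is a sketch under the assumption that the root is informed at time 0 and every inner node informs one child in the round after it is informed and the other child one round later; the function name is illustrative.

```python
from collections import Counter
from math import comb


def leaf_informing_times(depth: int) -> Counter:
    """Count, per round, the leaves of a complete binary tree informed in it."""
    times = [0]                                  # informing time of the root
    for _ in range(depth):
        # each node informs its first child after 1 round, its second after 2
        times = [t + step for t in times for step in (1, 2)]
    return Counter(times)


if __name__ == "__main__":
    d = 6
    counts = leaf_informing_times(d)
    for t in range(d + 1):
        print(d + t, counts[d + t], comb(d, t))
```

A leaf is informed at time $d + t$ exactly when $t$ of the $d$ edges on its root path were served second, which gives the count $\binom{d}{t}$.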

The ternary reduction graph $G'_{F,\ell}$, given a 3CNF-6 formula $F$ and a number
$\ell$ to be chosen later, consists of the following subgraphs (see Fig. 7).

1. The sources $v_1, \dots, v_n$ are roots of complete binary trees $B_1, \dots, B_n$ with
depth $\delta = \log(12\ell)$ and leaves $v^i_1, \dots, v^i_{2^\delta}$; $\ell$ will be chosen such that $\delta$ is an
even number.

A constant fraction of the leaves of $B_i$ are the literal nodes $x^{\alpha}_{i,j,k}$ of a subgraph
$G_k$. The rest of them, $y^{\alpha}_{i,j}$, are connected in pairs via $\delta$-chains. For an accurate
description we introduce the following definitions.



Fig. 7. The reduction graph $G'_{F,\ell}$ (sources $v_1, v_2, \dots, v_n$).



Let UM := E '= 5 / 2+1 (f)- Since ^ and ( 5 / 2 + 75 ) ^ i* Isolds 

forp G {!,... ,v^}: < fs{p) < P^- For gd{x) := min{p | f{p,d) > a;} this 

implies for x G [0, -j^]: x^ < gs{x) < lOx^. Note that fs and gs are monotone 
increasing. 

Every node of $B_i$ is labeled by a binary string. If $r$ is the root, label($r$) is
the empty string $\lambda$. The two succeeding nodes $v_1, v_2$ of a node $w$ are labeled by
label($w$)0 and label($w$)1. Two leaves $x, y$ are called opposite if label($x$) can
be derived from label($y$) by negating every bit. For a binary string $s$ let $\Delta(s) :=
|\#_1(s) - \#_0(s)|$ be the difference of occurrences of 1 and 0 in $s$. Consider an
indexing of the leaves of $B_i$ such that for all $j \in \{1, \dots, 2^\delta - 1\}$:

A(label(u*)) < A(label(u*+ 2 )), and v* and have opposite labels for all 

jG 2 ^. 
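
The labeling notions can be illustrated in code. The sketch below computes $\Delta$, the opposite label, and one possible leaf indexing in which $\Delta$ is non-decreasing in steps of two; placing each leaf directly next to its opposite is an assumption made here for illustration, since the exact pairing index is not legible in the text above.

```python
from itertools import product


def delta_of(label: str) -> int:
    """Delta(s) = |#1(s) - #0(s)| for a binary string s."""
    return abs(label.count("1") - label.count("0"))


def opposite(label: str) -> str:
    """Negate every bit of the label."""
    return "".join("1" if bit == "0" else "0" for bit in label)


def index_leaves(depth: int) -> list:
    """Order all leaf labels so that Delta(order[j]) <= Delta(order[j + 2]).

    Assumption: opposite leaves are placed on neighbouring positions.
    """
    labels = ["".join(bits) for bits in product("01", repeat=depth)]
    # one representative per opposite pair, sorted by Delta
    reps = sorted((s for s in labels if s < opposite(s)),
                  key=lambda s: (delta_of(s), s))
    order = []
    for s in reps:
        order.extend([s, opposite(s)])
    return order
```

Because a label and its opposite have the same $\Delta$, sorting one representative per pair is enough to obtain the monotone indexing.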
2. For every binary tree $B_i$ according to these indices the literal nodes of $G_k$

are defined by ^ and for 

j G 3}, and fc G 

3. The other leaves of $B_i$ are connected pairwise by chains of length $\delta$ such
that opposite leaves of a tree represent free literal nodes $y^0_{i,j}$ and $y^1_{i,j}$. These
nodes are not part of any subgraph $G_k$.

4. The subgraphs $G_k$ for $k \in \{1, \dots, \ell\}$ described in the previous section have
a degree 5 at the literal nodes. These nodes are replaced with rings of size 5
to achieve degree 3 (see Fig. 4).



Theorem 3. It is NP-hard to distinguish ternary graphs $G = (V, E)$ with
multiple sources and broadcasting time $b \in \Theta(\log |V|)$ from those with broadcasting
time $b + c\sqrt{b}$ for any constant $c$.



Proof Sketch: If $F$ is satisfiable, then there is a coherently assigning broadcast
schedule with $b(G'_{F,\ell}) \le 2\delta + 4$.

An analogous observation to Lemma 2 for a busy broadcasting schedule $S$
for $G'_{F,\ell}$ is the following.



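— Every literal node will be informed directly from the source of its tree;

— For all $i \in \{1, \dots, n\}$ and for all $t \in \{0, \dots, \delta\}$ it holds
$|\{ j \in \{1, \dots, 2^\delta\} \mid b_S(v^i_j) = t + \delta \}| = \binom{\delta}{t}$;

— For $c_{\nu,k} = x^{\alpha_1}_{i_1,j_1,k} \vee x^{\alpha_2}_{i_2,j_2,k} \vee x^{\alpha_3}_{i_3,j_3,k}$:
$b_S(c_{\nu,k}) = \frac{\delta}{2} + \min_\rho \{ b_S(x^{\alpha_\rho}_{i_\rho,j_\rho,k}) \} + O(1)$.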

Again literal nodes are defined to be either low, high, or neutral. Clause nodes
are either high or neutral. For the number $q$ of low literals, $p$ of high literals,
and $p'$ the number of high clauses it holds $p = q$. There are $2p$, resp. $3p'$ nodes
in different chains that are informed later than $2\delta - 1$. Therefore there is a tree
$B_k$ that is involved in the delayed information of $2p/n$, resp. $3p'/n$ nodes. Using
$g_\delta$ it is possible to describe a lower bound of the time delay caused by $B_k$ as
follows.

$$b_S(G'_{F,\ell}) \ge 2\delta - 1 + \max\left\{ g_\delta\!\left(\frac{2p}{n}\right),\, g_\delta\!\left(\frac{3p'}{n}\right) \right\}$$

Let us assume that at least $\kappa m$ clauses are unsatisfied for every assignment.
The constant fraction of $y$-leaves of trees $B_i$ can be seen as an additional set of
unused literal nodes. Now consider a clause node that represents an unsatisfied
clause with respect to the assignment which is induced by the broadcast schedule.
Then there is at least a high clause node, a neutral clause node connected to a
low literal node, or a neutral clause node connected to a high literal node.

Since each literal node is chained to at most one clause node, this implies

$$\kappa \ell m \le p' + p + q = p' + 2p .$$
Note that 24i > 2^ . The observations above now imply 

bs{G'p^t-,{vi,. . . ,Vn}) > 2(5 - 1 -h 



2n 



$\ge 2\delta - 1 + \varepsilon \sqrt{\delta}$
for some $\varepsilon > 0$. Since for the set of nodes $V$ of $G'_{F,\ell}$ it holds $|V| \in O(\ell m \log \ell)$, it
is sufficient to choose $\ell$ as a non-constant polynomial of $m$. □

Theorem 4. It is NP-hard to distinguish ternary graphs $G = (V,E)$ with
single sources and broadcasting time $b \in \Theta(\sqrt{|V|})$ from those with broadcasting
time $b + c\sqrt{\log b}$ for some constant $c$.

Proof: We start by combining the reduction graph of the preceding theorem with a
ternary pyramid (see Fig. 3). The single source $v_0$ is the top of the pyramid. The
$n$ leaves were previously the sources. Note that the additional amount of
broadcasting time in a pyramid is $2n$ for $n - 1$ nodes and $2n - 1$ for one node for
any busy broadcasting schedule. Thus, the former sources are informed nearly
at the same time.

For the choice £ G the number of nodes of the new graph is bounded 

by 0{m^). The broadcasting time increases from G(logm) of G^^ to 0{m) and 
the indistinguishable difference remains 0{\/log m). I 



5 Conclusions 

The complexity of broadcasting time is a key for understanding the obstacles
to efficient communication in networks. This article answers the open question,
stated most recently in [BGNS98], whether single source broadcasting in the
Telephone model can be approximated within any constant factor. Until now, the
best known upper bound on the approximation ratio for broadcasting time is $O(\log |V|)$
[BGNS98], and the lower bound was known as one additive time unit. Thus, a
lower constant bound of a factor of $\frac{57}{56} - \varepsilon$ is a step forward. Yet there is room
for improvement.

It is possible to transfer this result to bounded degree graphs. But the
reconstruction of subgraphs with large degree decreases the lower bound dramatically.
Nevertheless, this paper improves on the inapproximability ratio in the single
source case, compared with the ratio of $1 + 1/\Theta(\sqrt{|V|})$ known so far
[JRS98]. The upper bound for approximating the broadcasting time of a ternary
graph is a constant factor. So matching upper and lower bounds remain
unknown.

From a practical point of view, network structures are often uncertain because 
of dynamic and unpredictable changes. And if the network is static, it is hardly 
ever possible to determine the ratio between switching time on a single processor 
and the delay on communication links. But if these parameters are known for 
every processor and communication link it turns out that an inapproximability 
factor $3 - \varepsilon$ applies [BGNS98]. For the simplest timing model, the Telephone
model, this paper shows that developing a good broadcasting strategy is also a 
computationally infeasible task. 

Abstract. We consider variants of the classic bin packing and multiple
knapsack problems, in which sets of items of different classes (colors) 
need to be placed in bins; the items may have different sizes and values. 
Each bin has a limited capacity, and a bound on the number of distinct 
classes of items it can hold. In the class-constrained multiple knapsack 
(CCMK) problem, our goal is to maximize the total value of packed
items, whereas in the class-constrained bin-packing (CCBP), we seek to
minimize the number of (identical) bins needed for packing all the items.
We give a polynomial time approximation scheme (PTAS) for CCMK 
and a dual PTAS for CCBP. We also show that the 0-1 class-constrained 
knapsack admits a fully polynomial time approximation scheme, even 
when the number of distinct colors of items depends on the input size. 
Finally, we introduce the generalized class-constrained packing problem
(GCCP), where each item may have more than one color. We show that
GCCP is APX-hard, already for the case of a single knapsack, where all 
items have the same size and the same value. 

Our optimization problems have several important applications, includ- 
ing storage management for multimedia systems, production planning, 
and multiprocessor scheduling. 



1 Introduction 

In the well-known bin packing (BP) and multiple knapsack (MK) problems, a
set, $I$, of items of different sizes and values has to be packed into bins of limited
capacities; a packing is legal if the total size of the items placed in a bin does 
not exceed its capacity. We consider the following class-constrained variants of 
these problems. Suppose that each item has a size, a value, and a class (color); 
each bin has limited capacity, and a limited number of compartments. Items of 
different classes cannot be placed in the same compartment. Thus, the number of 
compartments in each bin bounds the number of distinct classes of items it can 
accommodate. A packing is legal if it satisfies the traditional capacity constraint, 
as well as the class constraint. 

Formally, the input to our packing problems is a universe, $I$, of size $|I| = n$.
Each item $u \in I$ has a size $s(u)$ and a value $p(u)$. Each item in $I$
is colored with one of $M$ distinct colors. Thus, $I = I_1 \cup I_2 \cup \dots \cup I_M$, where any

* Author supported in part by Technion V.P.R. Fund - Smoler Research Fund, and 
by the Fund for the Promotion of Research at the Technion. 

K. Jansen and S. Khuller (Eds.): APPROX 2000, LNCS 1913, pp. 238—249, 2000. 
item $u \in I_i$ is colored with color $i$. The items need to be placed in bins, where
each bin $j$, $j = 1, 2, \dots$, has volume $V_j$ and $C_j$ compartments.

The output of our packing problems is a placement, which specifies for each
bin $j$ which items from each class are placed in $j$ (and accordingly, the colors
to which $j$ allocates compartments). A placement is legal if for all $j \ge 1$, bin $j$
allocates at most $C_j$ compartments, and the overall size of the items placed in $j$
does not exceed $V_j$. We study two optimization problems:
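
The legality condition translates directly into a check. The following is a minimal sketch; the `Item` and `Bin` records and the dictionary representation of a placement are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    size: int
    value: int
    color: int


@dataclass(frozen=True)
class Bin:
    volume: int          # V_j
    compartments: int    # C_j


def is_legal(bins, placement):
    """placement maps a bin index j to the list of items placed in bin j."""
    for j, items in placement.items():
        if sum(item.size for item in items) > bins[j].volume:
            return False                       # capacity constraint violated
        if len({item.color for item in items}) > bins[j].compartments:
            return False                       # class constraint violated
    return True


def total_value(placement):
    """Total value of all packed items (the CCMK objective)."""
    return sum(item.value for items in placement.values() for item in items)
```

A bin with volume 10 and two compartments, for example, accepts items of sizes 4 and 5 in two colors, but rejects the same load spread over three colors.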

The Class-Constrained Multiple Knapsack Problem (CCMK), in which
there are $N$ bins (to which we refer as knapsacks). A placement determines a
subset $S = S_1 \cup S_2 \cup \dots \cup S_M$ of $I$, such that $S_i \subseteq I_i$ is the subset of packed items
of color $i$. Our goal is to find a legal placement which maximizes the total value
of the packed items, given by $\sum_{u \in S} p(u)$;

The Class-Constrained Bin-Packing Problem (CCBP), in which the bins
are identical, each having size 1 and $C$ compartments, and all the items have
size $s(u) \le 1$. Our goal is to find a legal placement of all the items in a minimal
number of bins.

We also consider a generalized version of class-constrained packing (GCCP),
where each item, $u$, has a size $s(u)$, a value $p(u)$, and a set $c(u)$ of colors, such
that it is legal to color $u$ in any color in $c(u)$. Thus, if some knapsack allocates
compartments to the set $c_1$ of colors, then any item $u \in I$ added to this knapsack
needs to satisfy $c(u) \cap c_1 \neq \emptyset$. (In the above CCMK and CCBP problems, we
assume that $\forall u \in I$, $|c(u)| = 1$.)



1.1 Motivation 

Storage Management in Multimedia Systems: The CCMK problem is motivated
by a fundamental problem in storage management for multimedia-on-demand
(MOD) systems (see, e.g., [26]). In a MOD system, a large database of $M$ video
program files is kept on a centralized server. Each program file, $i$, is associated
with a popularity parameter, given by $q_i \in [0, 1]$, where $\sum_{i=1}^{M} q_i = 1$. The files
are stored on $N$ shared disks. Each of the disks is characterized by (i) its storage
capacity, that is, the number of files that can reside on it, and (ii) its load
capacity, given by the number of data streams that can be read simultaneously from
that disk. Assuming that $\{q_1, \dots, q_M\}$ are known, we can predict the expected
load generated by each of the programs at any time.

We need to allocate to each file disk space and a fraction of the load capacity,
such that the load generated due to access requests to that file is satisfied. The
above storage management problem can be formulated as a special case of the
CCMK problem, in which $s(u) = p(u) = 1$ for all $u \in I$: a disk $j$, with load
capacity $L_j$ and storage capacity $C_j$, is represented by a knapsack $K_j$, with
capacity $L_j$ and $C_j$ compartments, for all $j = 1, \dots, N$. The $i$th file, $1 \le i \le M$,
is represented by a set $I_i$, the size of which is proportional to the file popularity.
Thus, $n = |I| = \sum_{j=1}^{N} L_j$ and $|I_i| = q_i |I|$.¹ A solution for the CCMK problem
induces a legal assignment of the files to the disks: $K_j$ allocates a compartment
to items of color $i$ iff a copy of the $i$th file is stored on disk $j$, and the number
of items of color $i$ that are packed in $K_j$ is equal to the total load that file $i$ can
generate on disk $j$.

¹ For simplicity, we assume that $q_i |I|$ is an integer (otherwise we can use a standard
rounding technique [13]).

Production Planning: Our class-constrained packing problems correspond to the 
following variant of the production planning problem. Consider a set of machines, 
the jth machine has a limited capacity, Vj, of some physical resource (e.g., storage 
space, quantity of production materials). In addition, hardware specifications 
allow machine j to produce items of only Cj different types. The system receives 
orders for products of M distinct types. Each order u is associated with a demand 
for $s(u)$ units of the physical resource, a profit, $p(u)$, and its type $i \in \{1, \dots, M\}$.
We need to determine how the production work should be distributed among 
the machines. When the goal is to obtain maximal profit from a given set of 
machines, we have an instance of the CCMK. When we seek the minimal number 
of machines required for the completion of all orders, we have an instance of the 
CCBP. When each order can be processed under a few possible configurations, 
we get an instance of GCCP. 

Scheduling Parallelizable Tasks: Consider the problem of scheduling parallelizable
tasks on a multiprocessor, i.e., each task can run simultaneously on several
machines (see, e.g., [24]). Suppose that we are given $N$ parallelizable tasks, to
be scheduled on $M$ uniform machines. The $i$th machine, $1 \le i \le M$, runs at a
specific rate, $I_i$. Each task $T_j$ requires $V_j$ processing units and can split to run
(simultaneously) on at most $C_j$ machines. Our objective is to find a schedule
that maximizes the total work done in a given time interval. This problem can
be formulated as an instance of the CCMK, in which a task $T_j$ is represented by
a knapsack $K_j$ with capacity $V_j$ and $C_j$ compartments; a machine with rate $I_i$
is represented by $I_i$ items, such that $s(u) = p(u) = 1$, for all $u \in I$.

1.2 Related Work and Our Results 

There is a wide literature on the bin packing and the multiple knapsack problems 
(see, e.g., [15,2,12,7,3] and detailed surveys in [19,4,18]). Since these problems are 
NP-hard, most of the research work in this area focused on finding approximation 
algorithms. The special case of MK where $N = 1$, known as the classic 0-1 knapsack
problem, admits a fully polynomial time approximation scheme (FPTAS).
That is, for any e: > 0, a (1 — e)-approximation for the optimal solution can be 
found in 0{n/e^), where n is the number of items [14,9]. In contrast, MK was 
shown to be NP-hard in the strong sense [8], therefore it is unlikely to have an 
FPTAS, unless P = NP. It was unknown until recently whether MK possessed
a polynomial time approximation scheme (PTAS), whose running time is polynomial
in $n$, but can be exponential in $1/\varepsilon$. Chekuri and Khanna [3] resolved this
question: They presented an elaborate PTAS for MK, and showed that with
slight generalizations this problem becomes APX-Hard. Independently, Kellerer 
developed in [17] a PTAS for the special case of the MK, where all bins are 
identical. 

It is well known (see, e.g., [20]), that bin-packing does not belong to the class
of NP-hard problems that possess a PTAS. However, there exists an asymptotic
PTAS (APTAS), which uses $(1 + \varepsilon) OPT(I) + k$ bins for some fixed $k$. Vega and
Lueker presented an APTAS, with $k = 1$ [25]. Alternatively, a dual PTAS, which
uses $OPT(I)$ bins of size $(1 + \varepsilon)$ was given by Hochbaum and Shmoys [11]. Such
a dual PTAS can also be derived from the recent work of Epstein and Sgall [6] on
multiprocessor scheduling, since BP is dual to the minimum makespan problem.

Shachnai and Tamir considered in [21] a special case of the CCMK, in which
$s(u) = p(u) = 1$, for all $u \in I$. The paper presents an approximation algorithm
that achieves a factor of $c/(c+1)$ to the optimal, when $C_j \ge c$, $\forall\, 1 \le j \le N$.
Recently, Golubchik et al. [10] derived a tighter bound of $1 - 1/(1 + \sqrt{c})^2$ for
this algorithm, with a matching lower bound for any algorithm for this problem.
They gave a PTAS for CCMK with unit sizes and values, and showed that this
special case of the CCMK is strongly NP-hard, even if all the knapsacks are
identical. These hardness results are extended in [22].

In this paper we study the approximability of the CCBP and the CCMK 
problems. 

— We give a dual PTAS for CCBP which packs any instance, $I$, into $m \le
OPT(I)$ bins of size $1 + \varepsilon$.

— We present a PTAS for CCMK, whose running time depends on the number,
$t$, of bin types in the instance.³ Specifically, we distinguish between the case
where $t$ is some fixed constant and the general case, where $t$ can be as large
as $N$. In both cases, the profit is guaranteed to be at least $(1 - \varepsilon) OPT(I)$.

— We show that the 0-1 class-constrained knapsack (CCKP) admits an FPTAS.
For the case where all items have the same value, we give an optimal
polynomial time algorithm. Our FPTAS is based on a two-level dynamic
programming scheme. As in the MK problem [3], when we use the FPTAS
for CCKP to fill the knapsacks sequentially with the remaining items, we
obtain a $(2 + \varepsilon)$-approximation for CCMK.

— We show that GCCP is APX-hard, already for the case of a single knapsack,
where all items have the same size and the same value.

For the PTASs, we assume that $M$, the number of distinct colors of items, is
some fixed constant. The FPTAS for the 0-1 CCKP is suitable also for instances
in which the value of $M$ depends on the input size. When solving the CCBP, we
note that even if $M > 1$ is some fixed constant, we cannot adopt the technique
commonly used for packing (see, e.g., [11,16,25]), where we first consider the
large items (of size $\ge \varepsilon$), and then add the small items. In the presence of class
constraints, one cannot extend even an optimal placement of the large items into
an almost optimal placement of all items. The best we can achieve when packing
first the large items is an APTAS, whose absolute bin-waste depends on $M$.
Such an APTAS is given in [5].

Our results contain two technical contributions. We present (in Section 2.1) a
technique for eliminating small items. This technique is suitable for any packing
problem in which handling the small items is complex, and in particular, for
class-constrained packing. Using this technique, we transform an instance, $I$,
to another instance, $I'$, which contains at most one small item in each color.
We then solve the problem (CCBP in our case) on $I'$ and slightly larger bins,
which is much simpler than solving on $I$ with the original bins. Our second idea
is to transform any instance of CCMK to an instance which contains $O(\lg n/\varepsilon)$
distinct bin types. This reduction in the number of bin types is essential when
guessing the partition of the bins to color-sets (see Section 3.4).

² For other related work, see [23].

³ Bins of the same type have the same capacity, and the same number of compartments.

The rest of the paper is organized as follows. In Section 2 we give the dual 
PTAS for the CCBP problem. In Section 3 we consider the CCMK problem, and 
give PTASs for a fixed (Section 3.3) and an arbitrary number of bin types (Sec- 
tion 3.4). In Section 4 we consider the class-constrained 0-1 knapsack problem. 
Finally, in Section 5 we give the APX-hardness proof for GCCP. 

Due to space limitations, some of the proofs are omitted. Detailed proofs are 
given in the full version of the paper [23] . 

2 A Dual Approximation Scheme for CCBP 

In this section we derive a dual-approximation scheme for the CCBP problem.
Let $OPT_B(I)$ be the minimal number of bins needed for packing an instance
$I$. We give a dual PTAS for CCBP, that is, for a given $\varepsilon > 0$, we present an
algorithm, $A_\varepsilon$, which packs $I$ into $OPT_B(I)$ bins; the number of different colors
in any bin does not exceed $C$, and the sum of the sizes of the items in any bin
does not exceed $1 + \varepsilon$. The running time of $A_\varepsilon$ is polynomial in $n$ and exponential
in $M$. We assume throughout this section that the parameter $M > 1$ is some
constant.

Our algorithm operates in two stages. In the first stage we eliminate small
items, i.e., we transform $I$ into an instance $I'$ which consists of large items,
and possibly one small item in each color. In the second stage we pack $I'$. We
show that packing $I'$ is much simpler than packing $I$, and that we only need
to slightly increase the bin capacities, by factor $1 + \varepsilon$. We show that a natural
extension of known packing algorithms to the class-constrained problem yields a
complicated scheme. The reason is that without elimination, we need to handle
the small items of each color separately. The elimination technique presented
below involves some interaction between small items of different colors. Various
techniques can be used to approximate the optimal solution for $I'$. We adopt the
technique presented by Epstein and Sgall [6]. Alternatively, we could use interval
partition ([16,25]).

Note that when there are no class constraints, our elimination technique can 
be used to convert any instance I into one which contains a single small item. 



2.1 Eliminating Small Items 

We now describe our elimination technique, and show that the potential ‘damage’
when solving the problem on the resulting instance is small. For a given
parameter, $\delta > 0$, denote by small the subset of items of sizes $s(u) < \delta$. Other
items are considered large. Our scheme applies the following transformation to
a given instance I.

1. Partition I into M sets by the colors of the items.

2. For each color $1 \le i \le M$, partition the set of small items of color i into
groups: the total size of the items in each group is in the interval $[\delta, 2\delta)$;
one group may have total size $< \delta$. This can be done, e.g., by grouping the
small items greedily: we start to form a new group when the total size of
the previous one reaches or exceeds $\delta$.

The resulting instance, I', consists of non-grouped-items, which are the original
large items of I, large grouped-items of sizes in $[\delta, 2\delta)$, and at most M small
grouped-items (one in each color). Given a packing of I', we can replace each
grouped item by the set of small items from which it was formed. In this process, 
neither the total size of the items nor the set of colors contained in each bin is 
changed. Hence, any packing of I' into m bins induces a packing of I into m 
bins. 
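
The elimination step can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: items are represented as hypothetical `(size, color)` pairs, the function name `eliminate_small_items` is ours, and a group is closed as soon as its total size reaches the threshold `delta`, so that every closed group has total size in $[\delta, 2\delta)$.

```python
from collections import defaultdict


def eliminate_small_items(items, delta):
    """Build the reduced instance from a list of (size, color) pairs.

    Returns a list of (size, color, members) triples, where `members` holds
    the original items that were merged into the (possibly grouped) item.
    """
    # Large items are kept as they are (non-grouped items).
    result = [(s, c, [(s, c)]) for s, c in items if s >= delta]

    # Collect the small items of each color.
    small_by_color = defaultdict(list)
    for s, c in items:
        if s < delta:
            small_by_color[c].append((s, c))

    for color, smalls in small_by_color.items():
        group, total = [], 0
        for s, c in smalls:
            group.append((s, c))
            total += s
            if total >= delta:
                # Each small item is < delta, so total is in [delta, 2*delta).
                result.append((total, color, group))
                group, total = [], 0
        if group:
            # At most one leftover group per color, of total size < delta.
            result.append((total, color, group))
    return result
```

Ungrouping a packing of the reduced instance amounts to replacing each triple by its `members` list, which changes neither the load nor the color set of any bin.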

Our scheme constructs I', packs I', and then transforms the packing of I'
into a packing of I. The construction of I' and the above transformation are 
linear in n. We note that our assumption that M is fixed is needed only in the 
second phase, when packing I' . Hence, this assumption can be relaxed when our 
elimination process is used for other purposes. 

Now, we need to bound the potential damage from solving the problem on 
I' (rather than the original instance I). Let $S(A)$ denote the total size of the set
of items A.

Lemma 1. Given a packing of I in m bins of size 1, we can pack I' into m bins
of size $1 + 2\delta$.

In particular, for an optimal packing of I, we have: 

Corollary 1. Let $OPT_B(I, b)$ be the minimal number of bins of size b needed
for packing an instance I; then $OPT_B(I', 1 + 2\delta) \le OPT_B(I, 1)$.



2.2 Packing I' Using Epstein-Sgall's Algorithm

Epstein and Sgall [6] presented PTASs for multiprocessor scheduling problems. 
Given a parameter $\delta > 0$, their general approximation schema can be modified to
yield a dual PTAS for bin packing, which packs any instance I into the optimal
number of bins, with the size of each bin increased at most by factor $1 + \delta$. (This
can be done using binary search on m, the minimal number of bins, as in [11].)

The PTAS in [6] is based on partitioning each set of items, $S \subseteq I$, into
$O(1/\delta^2)$ subsets. The items in each subset have the same size. The small items
are replaced by few items of small, but non-negligible sizes. Thus, each set of
items is represented by a unique configuration of length $O(1/\delta^2)$. The algorithm
constructs a graph whose vertices are a source, a target and one vertex for each
configuration. The edges and their weights are defined in such a way that the
problem of finding an optimal packing is reduced to the problem of finding a
“good” path in the graph. The time complexity of the algorithm depends on the
size of the graph, which is dominated by the total number of configurations. For
a given $\delta$, the graph has size $O(n^f)$ where $f = 81/\delta^2$.

A natural extension of this algorithm for CCBP is to refine the configurations
to describe the number of items from each size set and from each color. Such an
extension results in a graph of size $O(n^f)$ where $f = M(2M + 1)^2/\delta^2$. The
value of f increases from $81/\delta^2$ to $M(2M + 1)^2/\delta^2$ since the small items of
each color are handled separately. In addition, the total size of small items of
each color is rounded to the nearest multiple of a fixed rounding unit. Thus, if
rounded items of C colors are packed into the same bin, we may get in this bin
an overflow of C times that unit. Given that there is only one small item in each
color, as in I', we can use the actual sizes of the small items. This decreases
significantly the number of possible configurations and prevents any overflow
due to rounding.

Hence, given an instance consisting of large items and at most M small 
items, each of different color, we derive a simplified version of the PTAS in [6]. 
In this version, denoted by $A_{ES}$:

1. Each set of items $A \subseteq I'$ is represented by a unique configuration of length
$O(M/\delta^2)$, which indicates how many items in A belong to each color and to
each size class. Note that there is no need to round the total size of the small
items; we only need to indicate which small items are included in the set (a
binary vector of length M). In terms of [6], this means that there is no need
to define the successor of a configuration, and no precision is lost because of 
small items. 

2. To ensure that each bin contains items of at most C colors, in the config- 
uration graph, we connect vertices representing configurations which differ 
by at most C colors. In terms of [6], the gap between two configurations,
$(w, n'')$ and $(w, n')$, is defined only if $n'' - n'$ is positive in entries that belong to
at most C colors.

We summarize in the next lemma:

Lemma 2. Let m be the minimal number of bins of size b needed for packing I';
then, for a given $\delta > 0$, $A_{ES}$ finds a packing of I' into m bins of size $b(1 + \delta)$.
The running time of $A_{ES}$ is $O(n^f)$, where the exponent f depends only on M and $\delta$.



2.3 The Algorithm 

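Let $\delta$ be a sufficiently small constant fraction of $\varepsilon$, so that $(1 + 2\delta)(1 + \delta) \le 1 + \varepsilon$, and let $b = 1 + 2\delta$. The algorithm proceeds as follows.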

1. Construct I' from I, using the algorithm in Section 2.1. 

2. Use $A_{ES}$ for packing I' into bins of size $b(1 + \delta)$. Let m be the number of
bins used.

3. Ungroup the grouped items to obtain a packing of I in m bins of size $b(1 + \delta)$.



Theorem 1. The algorithm $A_\varepsilon$ uses at most $OPT_B(I)$ bins. The sum of the
sizes of the items in any bin does not exceed $1 + \varepsilon$. The running time of $A_\varepsilon$ is
$O(n^f)$, where the exponent f depends only on M and $\varepsilon$.



3 Approximation Schemes for CCMK 

In this section we present a PTAS for CCMK. We employ the guessing approach 
developed in [3]. 

3.1 The PTAS of Chekuri-Khanna for the MKP

Let $P(U)$ denote the value of a set $U \subseteq I$, and let $OPT(I)$ be the value of an
optimal solution when packing I. The PTAS in [3] is based on two steps: (i)
Guessing items: identify a set of items $U \subseteq I$ such that $P(U) \ge (1 - \varepsilon)OPT(I)$
and U has a feasible packing. (ii) Packing items: given such U, find a feasible
packing of $U' \subseteq U$, such that $P(U') \ge (1 - \varepsilon)P(U)$.

As shown in [3], the original instance I can be transformed into an instance
in which the number of profit classes is $O(\ln n/\varepsilon)$, and the profits are in the range
$[1, n/\varepsilon]$, such that the loss in the overall profit is at most $\varepsilon$ from the optimal.
Thus, step (i) requires a polynomial number of guesses. In step (ii), U is transformed into
an instance with $O(\ln n/\varepsilon^2)$ size classes, and the bins are partitioned to blocks,
such that the bins in each block have identical capacity, to within factor $1 + \varepsilon$.
The items in $U' \subseteq U$ are then packed into the bin blocks.

3.2 An Overview of - A PTAS for CCMK 

For some $t \ge 1$, assume that there are t types of bins, such that bins of type
j have the same capacity $b_j$, and the same number of compartments, $C_j$. Let
$C = \max_j C_j$ be the maximal number of compartments in any type of bins. We
denote by $B_j$ the set of bins of type j; let $m_j = |B_j|$ be the number of bins
in $B_j$. Our schema proceeds in three steps: (i) Guess a subset $U \subseteq I$ of items
that will be packed in the bins, such that $P(U) \ge (1 - \varepsilon)OPT(I)$. (ii) Guess a
partition of the bins to at most $r = O(M^C)$ color-sets, i.e., associate with each
subset of bins the set of colors that can be packed in these bins. The resulting
number of bin types is $T \le t \cdot r$. We call a subset of bins of the same type and
color-set a block. (iii) Find a feasible packing of $U' \subseteq U$ in the bin blocks, such
that $P(U') \ge (1 - \varepsilon)P(U)$.

In the first step we use the transformation in [3] to obtain an instance with 
M color classes and $O(\ln n/\varepsilon)$ profit classes. Thus, we can find the set U in

0{nO(M/e^)) 

guesses. 

For the remaining steps, we first obtain an instance with $O(\ln n/\varepsilon^2)$ size
classes; distinguishing further between items by their colors, we can now assume
that U consists of $h \le O(M \ln n/\varepsilon^2)$ classes, i.e., $U = U_1 \cup \cdots \cup U_h$, and the
items in $U_i$ are of the same size and color. The implementation of the second
and the third steps varies, depending on t, the number of bin types. 

3.3 CCMK with Fixed Number of Bin Types 

Assume that $t \ge 1$ is some fixed constant. In the second step, we need to determine
$m_{jl}$, the number of bins of type j associated with the l-th color-set. Thus,
the number of guesses is $O(n^T)$.

The final step of our scheme is implemented as follows. Let $n_i = |U_i|$ be the
number of items in the class $U_i$. We first partition $n_i$ to $1 \le T' \le T$ subsets,
where T' is the number of bin types in which the items of $U_i$ are packed. Note
that if the number of items of $U_i$ packed in bins of type j is smaller than $\varepsilon n_i/T$,
then these items can be ignored. This may cause an overall loss of at most a
factor of $\varepsilon$ from the total profit of $U_i$, since we can take the $(1 - \varepsilon)n_i$ most
profitable items. Therefore, assume that the number of items packed in each
type of bins is $\varepsilon n_i/T \le n_{ij} \le n_i$. We can now write $n_{ij}$ as a multiple of $\varepsilon n_i/T$,
and take all the pairs of the form $(n_{ij}, j)$ (if $n_{ij}$ is not a multiple of $\varepsilon n_i/T$, again,
we can remove the less-profitable items, with an overall loss of at most a factor
of $\varepsilon$ from the total profit of $U_i$). Any partition of $U_i$ among the T types of bins
corresponds to a subset of these pairs. As we need to consider all the classes,
the overall number of guesses is $n^{O(MT^2/\varepsilon^3)}$. We can now use, for each type of
bins, any known PTAS for bin packing (see, e.g., [16], whose time complexity is
$O(n/\varepsilon^2)$) and take for each bin type the $m_j$ most profitable bins. We summarize
with the next result.

Theorem 2. The CCMK with a fixed number of distinct colors, M > 1, and 
a fixed number of bin types, t > 1, admits a PTAS, whose running time is 
0(T„0(MTVeb). 



3.4 CCMK with Arbitrary Bin Sizes 

Suppose that the number of bin types is t = 0(lg n). A naive implementation of 
the second step of our scheme results in guesses. Thus, we proceed 

as follows. We first partition each set of bins of the same type to at most r
color-sets. If the number of bins allocated to some color-set is not a multiple of
$\varepsilon m_j/r$, then we add bins and round this allocation to the nearest multiple of
$\varepsilon m_j/r$. Note that the total number of added bins is at most $\varepsilon m_j$. Thus, after the
placement is completed we can pick for each j the $m_j$ most profitable bins with
an overall loss of at most a factor of $\varepsilon$ from the total profit. Hence, we assume
that the number of bins allocated to the l-th color-set is a multiple of $\varepsilon m_j/r$.
Taking all the pairs (rriji, 1), we get that the number of guesses is 2^^'' The 
overall number of guesses, when taking all bin types, is 

In the third step we adapt the packing steps in [3] , with the bins partitioned 
to blocks, as defined above. We omit the details. 

Indeed, in the general case, t can be as large as N. In the following, we show 
that the set of bins can be transformed to a set in which the number of bin types 
is $O(\lg(n/\varepsilon))$, such that the overall loss in profit is at most factor $\varepsilon$.

Lemma 3. Any set of bins with t bin-types can be transformed into a set with
$O(\lg(n/\varepsilon))$ distinct bin types, such that for any $U \subseteq I$ that has a feasible packing
in the original set, there exists $U'' \subseteq U$ that has a feasible packing in the
transformed set, and $P(U'') \ge (1 - \varepsilon)P(U)$.

We proceed to apply the above PTAS on the resulting set of bins. Thus, 
Theorem 3. For any instance I of the CCMK, in which M > 1 is fixed, there is 
a PTAS which obtains a total profit of {1 — e)OPT{I) in time rpP ^ 

4 The Class-Constrained 0-1 Knapsack Problem 

In this section we consider the class-constrained 0-1 knapsack problem (CCKP), 
in which we need to place a subset S of I in a single knapsack of size $b \in \mathbb{R}$. The
objective is to pack items of maximal value from I in the knapsack, such that
the sum of the sizes of the packed items does not exceed b, and the number of
different colors in the knapsack does not exceed C. 

Throughout this section we assume that the numbers C and M are given
as part of the input. (Otherwise, using an FPTAS for the classic 0-1 knapsack
problem, we can examine all the $\binom{M}{C}$ possible subsets of C colors.) We discuss
below two special classes of instances of the CCKP: (i) instances with color-based
values, in which the items in each color have the same value (and arbitrary sizes),
i.e., for $1 \le i \le M$, the value of any item in color i is $p_i$; (ii) instances with
uniform values, in which all the items have the same value (regardless of their
size or color). Note that for instances with non-uniform values the problem is
NP-hard. Indeed, when C = M = n, i.e., there is one item in each color and no
color-constraints, we get an instance of the classic 0-1 knapsack problem.

We present an FPTAS for the CCKP, whose time complexity depends on 
the uniformity of the item values. In particular, for uniform-value instances we
get a polynomial time optimal algorithm. As in the FPTASs for the non-class-constrained
problem [14], and for the cardinality-constrained problem [1], we
combine scaling of the profits with dynamic programming (DP). However, in 
the class-constrained problem we need two levels of DP recursions, as described 
below. 

4.1 An Optimal Solution Using Dynamic Programming 

Assume that we have an upper bound U on the optimal value of the solution to 
our problem. Given such an upper bound, we can formulate an algorithm, based 
on dynamic programming, to compute the optimal solution. The time complexity 
of the algorithm is polynomial in U, n and M. For each color $i \in \{1, \dots, M\}$, let
$n_i$ denote the number of items of color i in the instance I. Thus, $n = \sum_{i=1}^{M} n_i$.
The items in each color are given in arbitrary order.

The algorithm consists of two stages. In the first stage we calculate for each
color i the value $h_{i,k}(a)$, for $k = 1, \dots, n_i$ and $a = 0, \dots, U$: $h_{i,k}(a)$ is the
smallest size sum of items with total value a, out of the first k items of color i.

In the second stage of the algorithm, we calculate $f_i(a, \ell)$, for all $i = 1, \dots, M$;
$a = 0, \dots, U$, and $\ell = 1, \dots, C$: $f_i(a, \ell)$ is the smallest size sum of items with
total value a of $\ell$ colors out of the colors $1, \dots, i$. The table $f_i$ can be calculated
using the tables $h_i$, $\forall 1 \le i \le M$.

The optimal solution value for the problem is given by
$\max_{a = 0, \dots, U;\ \ell = 1, \dots, C} \{a : f_M(a, \ell) \le b\}$.
The time complexity of the recursion is $O(MCU^2)$. Adding the
time needed to construct the tables $h_i$, which is $O(Un)$, we have a total of
$O(MCU^2 + nU)$.
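
The two-stage recursion can be illustrated with the following Python sketch. It is not taken from the paper: the item representation as `(size, value, color)` triples, the function name `cckp_dp`, and the convention of also keeping a row for zero colors (so that the empty packing is represented) are assumptions made for the example. Values are assumed to be positive integers and `U` an upper bound on the optimal value.

```python
import math


def cckp_dp(items, b, C, U):
    """Max total value that fits in a knapsack of size b using at most C colors.

    items: list of (size, value, color), with positive integer values.
    """
    INF = math.inf
    colors = sorted({c for _, _, c in items})

    # Stage 1: h[c][a] = smallest total size of items of color c with value a.
    h = {}
    for c in colors:
        table = [INF] * (U + 1)
        table[0] = 0
        for size, value, color in items:
            if color != c:
                continue
            # Standard 0-1 knapsack update, scanning values downwards.
            for a in range(U, value - 1, -1):
                if table[a - value] + size < table[a]:
                    table[a] = table[a - value] + size
        h[c] = table

    # Stage 2: f[a][l] = smallest total size achieving value a with exactly
    # l of the colors processed so far.
    f = [[INF] * (C + 1) for _ in range(U + 1)]
    f[0][0] = 0
    for c in colors:
        g = [row[:] for row in f]          # option: skip color c entirely
        for a in range(1, U + 1):
            for l in range(1, C + 1):
                for a2 in range(1, a + 1):  # value a2 >= 1 taken from color c
                    cand = f[a - a2][l - 1] + h[c][a2]
                    if cand < g[a][l]:
                        g[a][l] = cand
        f = g

    return max(a for a in range(U + 1) for l in range(C + 1) if f[a][l] <= b)
```

The triple loop of the second stage performs $O(U^2 C)$ work per color, in line with the $O(MCU^2)$ bound stated above; the first stage costs $O(U)$ per item.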

This time complexity can be improved to $O(\sum_{i=1}^{M} n_i U C) = O(nUC)$ when
our instance has color-based values. For uniform-value instances, we can assume
w.l.o.g. that $\forall u \in I$, $p(u) = 1$. Since we pack at most n items, we can bound the
maximal profit by U = n, and we get an optimal $O(n^2 C)$ algorithm.

4.2 FPTAS for Non-uniform Values 0-1 Knapsack Problem 

By using the pseudo-polynomial algorithm above we can devise an FPTAS for 
instances with non-uniform values. First, we need an upper bound on the value,
$z^*$, of an optimal solution. Such an upper bound can be obtained from a simple
^-approximation algorithm, Ar- Let be the profit obtained by Ar- Then, 
clearly, ^^z^ is an upper bound on the value of z*. 

We scale the item values by replacing each value pi by qi = \pin/sz ^~\ , where 
$1 - \varepsilon$ is the required approximation ratio. Then, we set the upper bound U for
the new instance to be |"|^] +n. Finally, we apply the DP scheme and return 
the optimal ‘scaled’ solution as the approximated solution for the non-scaled 
instance. As in [14,1], the combination of scaling with DP yields an FPTAS. 

Theorem 4. There is an FPTAS for CCKP, whose running time is 0(Mv? je) 
for color-based instances, and ^)) for arbitrary values. 

For the CCMK problem it is now natural to analyze the greedy algorithm, 
which packs the knapsacks sequentially, by applying the above FPTAS for a 
single knapsack to the remaining items. Recently, this algorithm was analyzed 
for the MKP [3]. It turns out that the result presented in [3] for non-uniform
knapsacks can be adapted for the CCMK. Let Greedy($\varepsilon$) refer to this algorithm,
with error-parameter $\varepsilon$ for the single knapsack FPTAS. Then,

Theorem 5. Greedy($\varepsilon$) yields a $(2 + \varepsilon)$-approximation for CCMK.

As in [3], the bound is tight; also, the performance of the algorithm cannot be 
improved by ordering the knapsacks in non-increasing order by their capacities. 



5 Generalized Class-Constrained Packing 

Recall that in GCCP, each item $u \in I$ is associated with a set $c(u)$ of colors.
Denote by $c(j)$ the set of colors for which the knapsack $K_j$ allocates compartments
($|c(j)| \le C_j$). Then, u can be packed in $K_j$ iff $c(u) \cap c(j) \ne \emptyset$. We show
that the GCCP problem is APX-hard, that is, there exists $\varepsilon_1 > 0$ such that it
is NP-hard to decide whether an instance has a maximal profit P, or if every
legal packing has profit at most $(1 - \varepsilon_1)P$. This hardness result holds even for
a single knapsack, and for instances in which all the items have the same size
and the same value. Moreover, for each item $u \in I$, the cardinality of $c(u)$ is a
constant (at most 4).

Theorem 6. The GCCP problem is APX-Hard, even for one knapsack and in- 
stances with uniform value and size. 

Remark 1. Another generalization of CCMK, in which the color of each item 
depends on the knapsack in which it is packed, is also APX-hard. 

Abstract. We consider the problem of scheduling jobs online, where
jobs may be served partially in order to optimize the overall use of the 
machines. Service requests arrive online to be executed immediately; the 
scheduler must decide how long and if it will run a job (that is, it must 
fix the Quality of Service level of the job) at the time of arrival of the 
job: preemption is not allowed. We give lower bounds on the competitive 
ratio and present algorithms for jobs with varying sizes and for jobs with 
uniform size, and for jobs that can be run for an arbitrary time or only 
for some fixed fraction of their full execution time. 



1 Introduction 

Partial execution or computation of jobs has been an important topic of research
in several papers [2,4,5,6,7,8,9,12,13]. Problems that are considered include,
e.g., imprecise computation, anytime algorithms and two-level jobs (see below).

In this paper, we study the problem of scheduling jobs online, where jobs 
may be served only partially in order to increase the overall use of the machines. 
This also allows, e.g., downsizing of systems. The decision as to how much of a
job to schedule has to be made at the start of the job. 

This corresponds to choosing the Quality of Service (QoS) in multimedia 
systems. One could, e.g., consider the transmission of pictures or other multimedia
data, where the quality of the transmission has to be set in advance (like quality 
parameters in JPEG), cannot be changed halfway and transmissions should not 
be interrupted. 

Another example considers the scheduling of excess services. For instance, 
a (mobile) network guarantees a basic service per request. Excess quality in 
continuous data streams can be scheduled instantaneously if and when relevant, 
and if sufficient resources are available (e.g., available buffer storage at a network
node). 

Finally, when searching in multimedia databases, the quality of the search is 
adjustable. The decision to possibly use a better resolution quality on parts of 
the search instances can only be made on-line and should be serviced instantly 
if excess capacity is available [3]. 

In the paper, we consider the following setting. Service requests have to be 
accepted or rejected at the time of arrival; when (and if) they are accepted, 
they must be executed right away. We use competitive analysis to measure the
quality of the scheduling algorithms, comparing the online performance to that 
of an offline algorithm that knows the future arrivals of jobs. 

We first consider jobs with different job sizes. In that case, the amount by 
which the sizes can differ is shown to determine how well an algorithm can do: 
if all job sizes are between 1 and M, the competitive ratio is $\Omega(\ln M)$. We adapt
the algorithm Harmonic from [1] and show a competitive ratio of $O(\ln M)$.

Subsequently, and most importantly, we focus on scheduling uniform sized
jobs. We prove a randomized lower bound of 1.5, and we present a deterministic
scheduling algorithm with a competitive ratio slightly above $2\sqrt{2} - 1 \approx 1.828$.
Finally, we consider the case where jobs can only be run at two levels: a < 1 
and 1 . We derive a lower bound of 1 -I- a — . 

This is an extended abstract in which we do not give complete proofs. For 
more details, we refer to the full paper [10]. 

1.1 Related Work 

We give a short overview of some related work. 

In overloaded real-time systems, imprecise computation[Sfi,7] is a well-known 
method to ensure graceful degradation. On-line scheduling of imprecise compu- 
tation jobs is studied in [9,2], but mainly on task sets that already satisfy the 
(weak) feasible mandatory constraint: at no time may a job arrive which makes it 
infeasible to complete all mandatory subtasks (for the offline algorithm) . This is 
quite a strong constraint. Anytime algorithms are introduced in [5] and studied 
further in [13]. This is a type of algorithm that may be interrupted at any point, 
returning a result with a quality that depends on the execution time. 

In [4], a model similar to the one in this paper is studied, but on a single 
machine and using stochastic processes and analysis instead of competitive
analysis. Jobs arrive in a Poisson process and can be executed in two ways, full 
level or reduced level. If they cannot start immediately, they are put in a queue. 
The execution of jobs can either be switched from one level to the other, or 
it cannot (as is the case in our model). For both cases, a threshold method is 
proposed: the approach consists of executing jobs on a particular level depending 
on whether the length of the queue is more or less than a parameter M . The 
performance of this algorithm, which depends on the choice of M, is studied in 
terms of mean task waiting time, the mean task served computation time, and 
the fraction of tasks that receive full level computation. The user can adapt M 
to optimize his desired objective function. There are thus no time constraints 
(or deadlines) in this model, and the analysis is stochastic. In [12], this model is 
studied on more machines, again using probabilistic analysis. 



2 Definitions and Notations 

By n, we denote the number of machines. The performance measure is the total 
usage of all the machines (the total amount of time that machines are busy). 
For each job, a scheduling algorithm earns the time that it serves that job. The
goal is to use the machines most efficiently, in other words, to serve as many 
requests as possible for as long as possible. The earnings of an algorithm A on 
a job sequence $\sigma$ are denoted by $A(\sigma)$. The adversary is denoted by ADV. The
competitive ratio of an algorithm A, denoted by $r(A)$, is defined as

$$r(A) = \sup_{\sigma} \frac{ADV(\sigma)}{A(\sigma)}.$$


3 Different Job Sizes 

We will first show that if the jobs can have different sizes, the competitive ratio 
of an online algorithm is not helped much by having the option of scheduling 
jobs partially. The most important factor is the size of the accepted and rejected 
jobs, and not how long they run. This even holds when the job sizes are bounded. 

Lemma 1. If job sizes can vary without bound, no algorithm that schedules jobs 
on n machines can attain a finite competitive ratio. 

Proof. Suppose there is an r-competitive online algorithm A, and the smallest
occurring job size is 1. The following job sequence is given to the algorithm:
$x_1 = 1$, $x_2 = r$, $x_i = r^{i-1}$ $(i = 3, \dots, n)$, $x_{n+1} = 2r(1 + \dots + r^{n-1})$. All jobs arrive
at time t = 0. As soon as A refuses a job, the sequence stops and no more jobs
arrive.

Suppose A refuses job $x_i$, where $i \le n$. Then A earns at most $1 + r + \dots + r^{i-2}$,
while the adversary earns $1 + r + \dots + r^{i-1}$. We have

$$\frac{1 + r + \dots + r^{i-1}}{1 + r + \dots + r^{i-2}} = 1 + \frac{r^{i-1}}{1 + r + \dots + r^{i-2}} > 1 + r - 1 = r.$$

This implies A must accept the first n jobs. However, it then earns at most
$1 + \dots + r^{n-1}$. The adversary serves only the last job and earns 2r times as
much. □
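
The adversarial sequence of the proof can be checked numerically. The sketch below is only an illustration: the helper names `adversary_sequence` and `ratio_if_refusing` are ours, exact rational arithmetic is used to avoid rounding issues, and the refusal case is evaluated for $i \ge 2$ (for $i = 1$ the online algorithm earns nothing).

```python
from fractions import Fraction


def adversary_sequence(r, n):
    """Job sizes x_1, ..., x_{n+1} used in the proof, for ratio r and n machines."""
    r = Fraction(r)
    sizes = [r ** i for i in range(n)]   # x_1, ..., x_n = 1, r, ..., r^(n-1)
    sizes.append(2 * r * sum(sizes))     # x_{n+1} = 2r(1 + ... + r^(n-1))
    return sizes


def ratio_if_refusing(sizes, i):
    """ADV / A when A serves x_1..x_{i-1} completely and refuses x_i (i >= 2)."""
    online = sum(sizes[:i - 1])
    adversary = sum(sizes[:i])
    return adversary / online
```

For example, with r = 3 and n = 4 every refusal among the first n jobs gives a ratio above 3, while accepting all of them leaves the last job, worth 2r times the total earned, to the adversary.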

Note that this lemma holds even when all jobs can only run completely. 

If for all job sizes x we have $1 \le x \le M$, we can use similar methods to those
used in studying the video on demand problem studied in [1] to give lower and
upper bounds for our problem.

In [1], a central server has to decide which movies to show on a limited
number of channels. Each movie has a certain value determined by the number
of people that have requested that movie, and the goal is to use the channels
most profitably. 

Several technical adjustments in both the proof of the lower bound and in 
the construction of the algorithm Harmonic are required. We refer to the full 
paper [10] for details. 

Theorem 1. Let r be the optimal competitive ratio of this scheduling problem
with different job sizes. Then $r = \Omega(\ln M)$. For $M = \Omega(2^n)$, we have
$r = \Omega(n(\sqrt[n]{M} - 1))$. Adapted Harmonic, which requires $n = \Omega(M H_M)$, has a
competitive ratio of $O(\ln M)$.

4 Uniform Job Sizes

We will now study the case of identical job sizes. For convenience, we take 
the job sizes to be 1. In this section we allow that the scheduling algorithm is 
completely free in choosing how long it serves any job. The simplest algorithm 
is Greedy, which serves all jobs completely if possible. Clearly, Greedy maintains 
a competitive ratio of 2, because it can miss at most 1 in earnings for every job 
that it serves. 

Lemma 2. For two machines and jobs of size 1, Greedy is optimal among algo- 
rithms that are free to choose the execution times of jobs between 0 and 1, and 
it has a competitive ratio of 2. 

Proof. We refer to the full paper [10]. 

We give a lower bound for the general case, which even holds for randomized 
algorithms. 

Theorem 2. For jobs of size 1 on n > 2 machines, no (randomized) algorithm 
that is free to choose the execution times of jobs between 0 and 1 can have a 
lower competitive ratio than 3/2. 

Proof. We use Yao’s Minimax Principle [11]. 

We examine the following class of random instances. At time 0, n jobs arrive.
At time $0 < t \le 1$, n more jobs arrive, where t is uniformly distributed over the
interval (0, 1]. The expected optimal earnings are 3n/2: the first n jobs are served
for such a time that they finish as the next n jobs arrive, which is expected to
happen at time 1/2; those n jobs are served completely.

Consider a deterministic algorithm A and say A earns x on running the first
n jobs (partially). If A has $v(t)$ machines available at time t, when the next n
jobs arrive, then it earns at most an additional $v(t)$. Its expected earnings are
at most $x + \int_{t=0}^{1} v(t)\,dt = n$, since $\int_{t=0}^{1} v(t)\,dt$ is exactly the earnings that A
missed by not serving the first n jobs completely: $x = n - \int_{t=0}^{1} v(t)\,dt$. Therefore
$r(A) \ge 3/2$. □
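
The accounting in this proof can be made concrete with a small Python sketch. The representation is an assumption for illustration: a deterministic algorithm is described only by the run lengths it chooses for the first n jobs (the hypothetical list `run_lengths`), and it is credited the maximum possible amount, one unit per free machine, on the second batch.

```python
from fractions import Fraction


def expected_online_bound(run_lengths):
    """Upper bound on expected earnings when job k of the first batch runs
    for run_lengths[k] time, with the second batch arriving at a time t
    that is uniform on (0, 1]."""
    x = sum(run_lengths)                          # earned on the first batch
    # Machine k is free at time t iff t >= run_lengths[k], so the integral
    # of v(t) over (0, 1] equals the sum of (1 - run_lengths[k]).
    integral_v = sum(1 - c for c in run_lengths)
    return x + integral_v


def expected_optimum(n):
    """Expected offline earnings: run the first batch until time t (mean 1/2),
    then serve the second batch completely."""
    return n * Fraction(1, 2) + n
```

Whatever run lengths are chosen, `expected_online_bound` returns n, so the ratio to `expected_optimum(n)` is 3/2.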

We now present an algorithm SL which makes use of the possibility of choosing
the execution time. Although SL could run jobs for any time between 0 and
1, it runs all jobs either completely (long jobs) or for $\frac{1}{2}\sqrt{2}$ of the time (short
jobs). We denote the number of running jobs of these types at time t by $l(t)$ and
$s(t)$. The arrival time of job j is denoted by $t_j$.

The idea is to make sure that each short job is related to a unique long
job which starts earlier and finishes later. To determine which long jobs to use,
marks are used. Short jobs are never marked. Long jobs get marked to enable
the start of a short job, or when they have run for at least $1 - \frac{1}{2}\sqrt{2}$ time. The
latter is because a new short job would always run until past the end of this
long job. In the algorithm, at most $s_0 = \lceil (3 - \sqrt{2})n/7 \rceil \approx 0.22654 \cdot n$ jobs are
run short simultaneously at any time. We will ignore the rounding and take
$s_0 = (3 - \sqrt{2})n/7$ in the calculations. The algorithm is as follows.

Algorithm SL. If a job arrives at time t, refuse it if all machines are busy.

If a machine is available, first mark all long jobs j for which $t - t_j \ge 1 - \frac{1}{2}\sqrt{2}$.
Then if $s(t) < s_0$ and there exists an unmarked long job x, run the new job for
$\frac{1}{2}\sqrt{2}$ time and mark x. Otherwise, run it completely.
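
A minimal simulation of SL is sketched below. It follows the rule as stated, with a few assumptions that the text leaves open: the unmarked long job to be marked is simply the first one found, simultaneous arrivals are processed in list order, a job whose end time equals the current time counts as finished, and floating-point arithmetic is used. The function name `run_sl` and the job records are illustrative.

```python
import math

SHORT = math.sqrt(2) / 2  # execution time of a short job


def run_sl(n, arrivals):
    """Simulate SL on n machines for unit-size jobs arriving at the given
    (non-decreasing) times. Returns (total earnings, number of refused jobs)."""
    s0 = math.ceil((3 - math.sqrt(2)) * n / 7)
    running = []   # records with keys: start, end, long, marked
    earned = 0.0
    refused = 0
    for t in arrivals:
        # Drop jobs that have finished by time t.
        running = [j for j in running if j["end"] > t]
        if len(running) >= n:
            refused += 1
            continue
        # Mark long jobs that have already run for at least 1 - sqrt(2)/2.
        for j in running:
            if j["long"] and t - j["start"] >= 1 - SHORT:
                j["marked"] = True
        shorts = sum(1 for j in running if not j["long"])
        unmarked = next(
            (j for j in running if j["long"] and not j["marked"]), None)
        if shorts < s0 and unmarked is not None:
            unmarked["marked"] = True
            running.append(
                {"start": t, "end": t + SHORT, "long": False, "marked": False})
            earned += SHORT
        else:
            running.append(
                {"start": t, "end": t + 1.0, "long": True, "marked": False})
            earned += 1.0
    return earned, refused
```

For instance, `run_sl(7, [0.0] * 8)` runs two jobs short and five long and refuses the eighth job, since the rounded value of $s_0$ for n = 7 is 2.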

Theorem 3. SL maintains a competitive ratio of

$$R = 2\sqrt{2} - 1 + \frac{8\sqrt{2} - 11}{n} \approx 1.8284 + \frac{0.31371}{n},$$

where n is the number of machines.

Proof. We will give the proof in the next section. 

5 Analysis of Algorithm SL 

Below, we analyze the performance of algorithm SL, which was given in Section 
4, and prove Theorem 3. 



Fig. 1. A run of SL.



Consider a run of SL as in Figure 1. We introduce the following concepts. 

— A job is of type A if at some moment during the execution of the job, all ma- 
chines are used; otherwise it is of type B. (The jobs are marked accordingly 
in Figure 1.) 

— Lost earnings are earnings of the adversary that SL misses. (In Figure 1, 
the lost earnings are marked grey.) Lost earnings are caused because jobs 
are not run or because they are run too short. 

— A job or a set of jobs compensates for an amount x of lost earnings, if SL
earns y on that job or set of jobs and $(x + y)/y \le R$ (or $x/y \le R - 1$). I.e.,
it does not violate the anticipated competitive ratio R.

A job of type B can only cause lost earnings when it is run short, because no 
job is refused during the time a job of type B is running. However, this causes 
at most $1 - \frac{1}{2}\sqrt{2}$ of lost earnings, so there is always enough compensation for
these lost earnings from this job itself. 

When jobs of type A are running, the adversary can earn more by running 
any short jobs among them longer. But it is also possible that jobs arrive while 
these jobs are running, so that they have to be refused, causing even more lost 
earnings. We will show that SL compensates for these lost earnings as well. We 
begin by deriving some general properties of SL.

Note first of all that if n jobs arrive simultaneously when all of SL's machines
are idle, it serves $s_0$ of them short and earns
$\frac{1}{2}\sqrt{2}\, s_0 + (n - s_0) = (6 + 5\sqrt{2})n/14 \approx 0.93365n$.
We denote this amount by $x_0$.

Properties of SL. 

1. Whenever a short job starts, a (long) job is marked that started earlier and
that will finish later. This implies $l(t) \ge s(t)$ for all $t$.

2. When all machines are busy at some time $t$, SL earns at least $x_0$ from the
jobs running at time $t$. (Since $s(t) \le s_0$ at all times.)

3. Suppose that two consecutive jobs, $a$ and $b$, satisfy that
$t_b - t_a < 1 - \frac{1}{2}\sqrt{2}$ and that both jobs are long. Then $s(t_b) = s_0$
(and therefore $s(t_a) = s_0$), because $b$ was run long although $a$ was not marked yet.

Lemma 3. If at some time $t$ all machines are busy, at most $n - s_0$ jobs running
at time $t$ will still run for $\frac{1}{2}\sqrt{2}$ or more time after $t$.

Proof. Suppose all machines are busy at time $t$. Consider the set $L$ of (long)
jobs that will be running for more than $\frac{1}{2}\sqrt{2}$ time, and suppose it contains
$x \ge n - s_0 + 1$ jobs. We derive a contradiction.

Denote the jobs in $L$ by $j_1, \ldots, j_x$, where the jobs are ordered by arrival time.
At time $t_{j_x}$, the other jobs in $L$ must have been running for less than
$1 - \frac{1}{2}\sqrt{2}$ time, otherwise they would finish before time $t + \frac{1}{2}\sqrt{2}$.
This implies that jobs in $L$ can only be marked because short jobs started.

Also, if at time $t_{j_x}$ we consider $j_x$ not to be running yet, we know not all
machines are busy at time $t_{j_x}$, or $j_x$ would not have started. We have

$$n > s(t_{j_x}) + l(t_{j_x}) \ge s(t_{j_x}) + n - s_0,$$

so $s(t_{j_x}) < s_0$. Therefore, between times $t_{j_1}$ and $t_{j_x}$ at most
$s(t_{j_x}) \le s_0 - 1$ short jobs can have been started and as a consequence, less
than $s_0$ jobs in $L$ are marked at time $t_{j_x}$. But then there is an unmarked job
in $L$ at time $t_{j_x}$, so $j_x$ is run short. This contradicts $j_x \in L$. □

Definition. A critical interval is an interval of time in which SL is using all 
its machines, and no jobs start or finish. 

We call such an interval critical, since it is only in such an interval that SL
refuses jobs, possibly causing large lost earnings. From Lemma 3, we see that
the length of a critical interval is at most $\frac{1}{2}\sqrt{2}$.

We denote the jobs that SL runs during $I$ by $j_1^I, \ldots, j_n^I$, where the jobs are
ordered by arrival time. We denote the arrival times of these jobs by
$t_1^I, \ldots, t_n^I$; $I$ starts at time $t_n^I$. We will omit the superscript $I$ if this
is clear from the context. We denote the lost earnings that are caused by the
jobs in $I$ by $X_I$; we also sometimes say simply that $X_I$ is caused by $I$. We say
that a job sequence ends with a critical interval, if no more jobs arrive after
the end of the last critical interval that occurs in SL’s schedule.



Lemma 4. If a job sequence ends with a critical interval $I$, and no other jobs
besides $j_1, \ldots, j_n$ arrive in the interval $[t_1, t_n]$, then SL can compensate for
the lost earnings $X_I$.

Proof. Note that $j_1$ is long, because a short job implies the existence of an earlier,
long job in $I$ by Property 1. SL earns at least $x_0$ from $j_1, \ldots, j_n$ by Property 2.
There are three cases to consider, depending on the size and timing of $j_2$.



Case 1. $j_2$ is short. See Figure 2, where we have taken $t_2 = 0$. Note that $j_1$ must
be the job that is marked when $j_2$ arrives, because any other existing jobs finish
before $I$ starts and hence before $j_2$ finishes. Therefore,
$t_2 - t_1 < 1 - \frac{1}{2}\sqrt{2}$, so before time $t_2$ the adversary and SL earn less
than $1 - \frac{1}{2}\sqrt{2}$ from job 1. After time $t_2$, the adversary earns at most
$(1 + \frac{1}{2}\sqrt{2})n$ from $j_1, \ldots, j_n$ and the jobs that SL refuses during $I$.
We have

$$(1 + \tfrac{1}{2}\sqrt{2})n + (1 - \tfrac{1}{2}\sqrt{2}) = R \cdot x_0,$$

so SL compensates for $X_I$.

Fig. 2. $j_2$ is short.
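The Case 1 identity can be checked exactly rather than numerically. The sketch below is only an illustration: it represents numbers of the form $a + b\sqrt{2}$ as pairs of rationals and compares $(1 + \frac{1}{2}\sqrt{2})n + (1 - \frac{1}{2}\sqrt{2})$ with $R \cdot x_0$, using $R = 2\sqrt{2} - 1 + (8\sqrt{2} - 11)/n$ and $x_0 = (6 + 5\sqrt{2})n/14$ as stated in the text. The helper names (`mul`, `add`, `scale`, `case1_sides`) are hypothetical and not part of the paper.

```python
from fractions import Fraction

# A number a + b*sqrt(2) is stored as the pair (a, b) of Fractions.

def mul(p, q):
    """Multiply (a + b*sqrt2) by (c + d*sqrt2) exactly."""
    a, b = p
    c, d = q
    return (a * c + 2 * b * d, a * d + b * c)

def add(p, q):
    """Add two numbers of the form a + b*sqrt2."""
    return (p[0] + q[0], p[1] + q[1])

def scale(p, k):
    """Multiply a + b*sqrt2 by the rational k."""
    return (p[0] * k, p[1] * k)

def case1_sides(n):
    """Return (adversary bound, R * x0) for n machines as exact pairs."""
    n = Fraction(n)
    one = (Fraction(1), Fraction(0))
    half_sqrt2 = (Fraction(0), Fraction(1, 2))          # (1/2) * sqrt(2)
    # R = 2*sqrt(2) - 1 + (8*sqrt(2) - 11) / n
    R = add((Fraction(-1), Fraction(2)),
            scale((Fraction(-11), Fraction(8)), 1 / n))
    # x0 = (6 + 5*sqrt(2)) * n / 14
    x0 = scale((Fraction(6), Fraction(5)), n / 14)
    # Adversary bound: (1 + sqrt(2)/2) * n + (1 - sqrt(2)/2)
    adversary = add(scale(add(one, half_sqrt2), n),
                    add(one, scale(half_sqrt2, -1)))
    return adversary, mul(R, x0)
```

Under these formulas both sides reduce to $(n + 1) + \frac{n - 1}{2}\sqrt{2}$, so the equality holds for every $n$, not only asymptotically.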



Case 2. $j_2$ is long and $t_2 - t_1 < 1 - \frac{1}{2}\sqrt{2}$.

Since no job arrives between $j_1$ and $j_2$, we have by Properties 3 and 1 that
$s(t_1) = s_0$ and $l(t_1) \ge s_0$. Denote the sets of these jobs by $S_1$ and $L_1$,
respectively. All these jobs finish before $I$. (During $I$, SL does not start or
finish any jobs.)

Case 2a. There is no critical interval while the jobs in $S_1$ and $L_1$ are running.

Hence, the jobs in $S_1$ and $L_1$ are of type B. We consider the jobs that are
running at time $t_1$ and the later jobs. Note that $L_1$ contains at least $s_0$ jobs, say
it contains $x$ jobs. After time $t_1$ the adversary earns at most $2n$, because $I$ ends
at most at time $t_1 + 1$. SL earns $\frac{1}{2}s_0\sqrt{2} + x$ from $S_1$ and $L_1$ and at
least $x_0$ on the rest. For the adversary, we must consider only the earnings on
$S_1$ and $L_1$ before time $t_1$; this is clearly less than $\frac{1}{2}s_0\sqrt{2} + x$.

We have

$$\frac{2n + \frac{1}{2}s_0\sqrt{2} + x}{x_0 + \frac{1}{2}s_0\sqrt{2} + x} < R \quad \text{for } x \ge s_0.$$

This shows SL compensates for $X_I$ (as well as for the lost earnings caused by
$S_1$ and $L_1$).



Case 2b. There exists a critical interval before $I$ which includes a job from $S_1$
or $L_1$. Call the earliest such interval $I_2$. If $I_2$ starts after $t_1$, we can calculate as
in Case 2a. Otherwise, we consider the earnings on each machine after the jobs
in $I_2$ started. Say the first job in $S_1$ starts at time $t'$. We have $t_n - t' < 1$. See
Figure 4.

Fig. 4. $j_2$ is long and there is another critical interval.

Say $I_2$ contains $x$ short jobs that are not in $S_1$ ($0 \le x \le s_0$). Then it contains
$s_0 - x$ short jobs from $S_1$, and therefore at least $s_0 - x$ (long) jobs from $L_1$. This
implies it contains at most $n - 2s_0 + x$ long jobs not from $L_1$. It also implies
there are $x$ short jobs in $S_1$ which are neither in $I$ nor in $I_2$.

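Using these observations, we can derive a bound on the earnings of the adversary
and of SL from the jobs in $I_2$ and later. We divide their earnings into
parts as illustrated in Figure 4 and have that the adversary earns at most

$$\begin{aligned}
  & (2 + \tfrac{1}{2}\sqrt{2})\,n && \text{(after } t'\text{)} \\
+\; & n - 2s_0 + x && \text{(from the long jobs not in } L_1\text{)} \\
+\; & (1 - \tfrac{1}{2}\sqrt{2})\,s_0 && \text{(from } L_1 \text{ before } t'\text{)} \\
+\; & \tfrac{1}{2}x\sqrt{2} && \text{(from the short jobs not in } S_1\text{)} \\
=\; & (3 + \tfrac{1}{2}\sqrt{2})\,n - (1 + \tfrac{1}{2}\sqrt{2})\,s_0 + x\,(1 + \tfrac{1}{2}\sqrt{2}),
\end{aligned}$$

while SL earns $2x_0$ (from the jobs in $I$ and $I_2$) $+\ \frac{1}{2}x\sqrt{2}$ (from the $x$
short jobs from $S_1$ between $I_2$ and $I$). We have

$$\frac{(3 + \frac{1}{2}\sqrt{2})\,n - (1 + \frac{1}{2}\sqrt{2})\,s_0 + x\,(1 + \frac{1}{2}\sqrt{2})}{2x_0 + \frac{1}{2}x\sqrt{2}} < R \quad \text{for } 0 \le x \le s_0,$$

so SL compensates for all lost earnings after $I_2$.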
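A small numerical sketch of the Case 2b bound follows. It assumes, purely for illustration, that $s_0$ is the real number $(3 - \sqrt{2})n/7$, which is the value obtained by solving $x_0 = \frac{1}{2}s_0\sqrt{2} + (n - s_0)$ with $x_0 = (6 + 5\sqrt{2})n/14$; any rounding of $s_0$ to an integer is ignored. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def constants(n):
    """Return (s0, x0, R) for n machines, treating s0 as a real number."""
    s0 = (3 - SQRT2) * n / 7            # assumption: derived from the x0 formula
    x0 = (6 + 5 * SQRT2) * n / 14
    R = 2 * SQRT2 - 1 + (8 * SQRT2 - 11) / n
    return s0, x0, R

def case2b_ratio(n, x):
    """Adversary bound divided by SL's earnings in Case 2b."""
    s0, x0, _ = constants(n)
    adversary = (3 + SQRT2 / 2) * n - (1 + SQRT2 / 2) * s0 + x * (1 + SQRT2 / 2)
    sl = 2 * x0 + x * SQRT2 / 2
    return adversary / sl

def worst_case2b_ratio(n, steps=200):
    """Largest ratio over a grid of x values in [0, s0]."""
    s0, _, _ = constants(n)
    return max(case2b_ratio(n, s0 * i / steps) for i in range(steps + 1))
```

Under this assumption the ratio grows with $x$ and equals $2\sqrt{2} - 1$ at $x = s_0$, so the additive term $(8\sqrt{2} - 11)/n$ in $R$ is what keeps the inequality strict.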

Case 3. $j_2$ is long and $t_2 - t_1 \ge 1 - \frac{1}{2}\sqrt{2}$. We consider job $j_3$.

If $j_3$ is short, then after time $t_1 + (1 - \frac{1}{2}\sqrt{2})$ the adversary earns at most
$(1 + \frac{1}{2}\sqrt{2})n - (n - 2)((t_3 - t_1) - (1 - \frac{1}{2}\sqrt{2})) - ((t_2 - t_1) - (1 - \frac{1}{2}\sqrt{2}))$.
Before that time, it earns of course $(1 - \frac{1}{2}\sqrt{2})$ (only counting the jobs in $I$).
So in total, it earns less than it did in Case 1.

If $j_3$ is long, we have two cases. If $t_3 - t_2 < 1 - \frac{1}{2}\sqrt{2}$, again the sets
$S_1$ and $L_1$ are implied and we are in Case 2. Finally, if
$t_3 - t_2 \ge 1 - \frac{1}{2}\sqrt{2}$ we know that $t_4 - t_3 < 1 - \frac{1}{2}\sqrt{2}$, so this
reduces to Case 1 or 2 as well.

In all cases, we can conclude that SL compensates for $X_I$. □

Lemma 5. If a job sequence ends with a critical interval $I$, then SL can compensate
for the lost earnings $X_I$.

Proof. We can follow the proof of Lemma 4. However, it is now possible that a
short job $j'$ starts after $j_1$, but finishes before $I$.

Suppose the first short job in $I$ arrives at time $t' = t_1 + x$. If the job sets $S_1$
and $L_1$ exist, we can reason as in Case 2 of Lemma 4. Otherwise, all long jobs
in $I$ that arrive before time $t'$, save one, are followed by short jobs not in $I$. (If
there are two such long jobs, they arrived more than $1 - \frac{1}{2}\sqrt{2}$ apart, and the
adversary earns less than in Case 1 of Lemma 4 (cf. Case 3 of that lemma).)

For each pair $(a_i, b_i)$, where $a_i$ is long and $b_i \notin I$ is short, we have that $b_i$
will run for at least $\frac{1}{2}\sqrt{2} - x$ more time after $t'$, while $a_i$ has run for at most $x$
time. One such pair is shown in Figure 5.



Fig. 5. Pairs of long and short jobs.



We compare the adversary’s earnings now to its earnings in Case 1 of Lemma
4. Since $b_i \notin I$, it earns less on the machine running $b_i$ and more on the machine
running $a_i$ (because there it earns something before time $t'$, which was not taken
into account earlier). If $x \le 1 - \frac{1}{2}\sqrt{2}$, the adversary loses more on the
machines running these pairs than it gains. On the other hand, if
$x > 1 - \frac{1}{2}\sqrt{2}$, then $I$ is shorter than $\frac{1}{2}\sqrt{2}$: the adversary earns
$x - (1 - \frac{1}{2}\sqrt{2})$ less on every machine. □

It is possible that two or more critical intervals follow one another. In that 
case, we cannot simply apply Lemma 5 repeatedly, because some jobs may be 
running during two or more successive critical intervals. Thus, they would be 
used twice to compensate for different lost earnings. We show in the full paper 
that SL compensates for all lost earnings in this case as well. 

Definition. A group of critical intervals is a set $\{I_1, \ldots, I_k\}$ of critical
intervals, where $I_{i+1}$ starts at most 1 time after $I_i$ finishes ($i = 1, \ldots, k - 1$).

Lemma 6. If a job sequence ends with a group of critical intervals, SL com- 
pensates for all the lost earnings after the first critical interval. 

Proof. The proof consists of showing that in all cases, the lost earnings between 
and after the critical intervals are small compared to SL's earnings on the jobs 
it runs. A typical case is shown in Figure 6. For details, see [10]. □ 

Theorem 4. SL maintains a competitive ratio of $R = 2\sqrt{2} - 1 + \frac{8\sqrt{2} - 11}{n}$.

Proof. If no jobs arrive within 1 time after a critical interval, the machines 
of both SL and the adversary are empty. New jobs arriving after that can be 
treated as a separate job sequence. Thus we can divide the job sequence into 
parts. The previous lemmas also hold for such a part of a job sequence. 

Consider (a part of) a job sequence. All the jobs arriving after the last critical
interval can be disregarded, since they are of type B: they compensate for
themselves. Moreover, they can only decrease the amount of lost earnings caused 
by the last critical interval (if they start less than 1 after a critical interval). 

If there is no critical interval, we are done. Otherwise, we can apply Lemma 
6 and remove the last group of critical intervals from consideration. We can then 
remove the jobs of type B at the end and continue in this way to show that SL 
compensates for all lost earnings. □ 
Fig. 6. A sequence of critical intervals.



6 Fixed Levels 

Finally, we study the case where jobs can only be run at two levels [4,12]. This 
reduces the power of the adversary and should lower the competitive ratio. If 
the jobs can have different sizes, the proofs from Section 3 still hold. 

Theorem 5. Let r be the optimal competitive ratio of this scheduling problem
with different job sizes and two fixed run levels. Then r = l7(lnM). For M = 
17(2"), we have r = — 1)). Adapted Harmonic, which requires n = 

Q^MHm), has a competitive ratio of 0{lnM). 

Proof. We refer to the full paper [10]. 

For the case of uniform jobs, we have the following bound. 

Theorem 6. If jobs can be run at two levels, $\alpha < 1$ and 1, then no algorithm
can have a better competitive ratio than $1 + \alpha - \alpha^2$.

Proof. Note that each job is run either for 0, $\alpha$ or 1 time. Let $n$ jobs arrive
at time $t = 0$. Say $A$ serves $\phi n$ jobs partially and the rest completely. It earns
$(1 - \phi + \alpha\phi)n$. If this is less than $n/(1 + \alpha - \alpha^2)$ we are done. Otherwise,
we have $\phi \le \frac{\alpha}{1 + \alpha - \alpha^2}$. Another $n$ jobs arrive at time $t = \alpha$.
$A$ earns at most $(1 + \alpha\phi)n$ in total, while the offline algorithm can earn
$n + n\alpha$. Since $\phi \le \frac{\alpha}{1 + \alpha - \alpha^2}$, we have

$$r(A) \ge \frac{1 + \alpha}{1 + \alpha\phi} \ge 1 + \alpha - \alpha^2. \quad □$$
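The two-step adversary in this proof can be sketched as follows. This is an illustration under the assumptions of the proof (earnings measured per machine, $\phi$ the fraction of the first batch served at level $\alpha$); the function names are hypothetical. The adversary either stops after the first batch or releases the second batch at time $\alpha$, whichever hurts the algorithm more.

```python
def lower_bound(alpha):
    """The claimed lower bound 1 + alpha - alpha^2."""
    return 1 + alpha - alpha ** 2

def forced_ratio(alpha, phi):
    """Ratio the adversary can force against a given fraction phi."""
    # Option 1: no more jobs arrive; offline runs all n jobs completely.
    first_batch = 1 - phi + alpha * phi
    stop_now = 1 / first_batch
    # Option 2: n more jobs arrive at time alpha; offline earns (1 + alpha) * n,
    # the algorithm at most (1 + alpha * phi) * n.
    second_batch = (1 + alpha) / (1 + alpha * phi)
    return max(stop_now, second_batch)

def best_phi_ratio(alpha, steps=1000):
    """Smallest forced ratio over a grid of phi values in [0, 1]."""
    return min(forced_ratio(alpha, i / steps) for i in range(steps + 1))
```

On a grid of $\phi$ values the smallest forced ratio stays at or above $1 + \alpha - \alpha^2$, with equality approached at $\phi = \alpha/(1 + \alpha - \alpha^2)$, where both adversary options coincide.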

Note that for $\alpha = \frac{1}{2}\sqrt{2}$, SL yields a competitive ratio for this problem of at
most 1.828 (but probably much better). Extending these results to more values 
of a is an open problem. 

7 Conclusions and Future Work 



We have studied the problem of scheduling jobs that do not have a fixed ex- 
ecution time on-line. We have first considered the general case with different 
job sizes, where methods from [1] can be used. Subsequently, we have given a 
randomized lower bound of 1.5 and a deterministic algorithm with competitive
ratio $\approx 1.828$ for the scheduling of uniform jobs. An open question is by how
much either the lower bound or the algorithm could be improved. In particular,
randomization might make it possible to find a better algorithm.

An extension of this model is to introduce either deadlines or startup times, 
limiting either the time at which a job should finish or the time at which it 
should start. Finally, algorithms for fixed level servicing can be investigated. 
Abstract. We present factor $\frac{4}{3}$ approximation algorithms for the problems
of finding the minimum 2-edge connected and the minimum 2-vertex
connected spanning subgraph of a given undirected graph.



1 Introduction 

The task of finding small spanning subgraphs of a prescribed connectivity is a 
fundamental problem in network optimization. Unfortunately, it is also a hard 
problem. In fact, even the problems of finding the smallest 2-edge connected 
spanning subgraph (2EC) or the smallest 2-vertex connected spanning sub- 
graph (2VC) are NP-hard. This can be seen via a simple reduction from the 
Hamiltonian cycle problem. 

One approach is that of approximation algorithms. It is easy to find a solution
that has no more than twice as many edges as the optimum: take the edges of a
depth-first search tree on the graph along with the deepest back-edge from each
vertex to obtain a 2-connected subgraph with at most $2n - 2$ edges. Since the
optimum has at least $n$ edges, this is a 2-approximation. Khuller and Vishkin
[6] gave a $\frac{3}{2}$-approximation algorithm for 2EC. Cheriyan et al. [2] have recently
improved upon this, with a $\frac{17}{12}$-approximation algorithm. In [6], Khuller and
Vishkin also gave a $\frac{5}{3}$-approximation algorithm for 2VC. This was improved to
$\frac{3}{2}$ in [3].
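The simple 2-approximation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a connected simple graph on vertices `0..n-1`, and it reads "deepest back-edge from each vertex" as the non-tree edge from that vertex to its ancestor nearest the root. The function names are hypothetical, and `is_two_edge_connected` is a brute-force checker included only so the result can be verified.

```python
def two_approx_2ec(n, edges):
    """DFS tree plus, per vertex, the back edge reaching highest in the tree.

    Returns a set of frozenset edges with at most 2n - 2 elements.
    """
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    depth, parent, chosen = {0: 0}, {0: None}, set()
    stack = [(0, iter(adj[0]))]
    while stack:                                   # iterative DFS from vertex 0
        v, it = stack[-1]
        for w in it:
            if w not in depth:                     # tree edge (v, w)
                depth[w] = depth[v] + 1
                parent[w] = v
                chosen.add(frozenset((v, w)))
                stack.append((w, iter(adj[w])))
                break
        else:
            stack.pop()

    for v in range(n):
        # Non-tree edges from v to proper ancestors (shallower, not the parent).
        ups = [w for w in adj[v] if depth[w] < depth[v] and w != parent[v]]
        if ups:
            chosen.add(frozenset((v, min(ups, key=depth.get))))
    return chosen


def is_two_edge_connected(n, edge_set):
    """Brute force: connected, and still connected after deleting any one edge."""
    def connected(es):
        adj = {v: [] for v in range(n)}
        for e in es:
            u, v = tuple(e)
            adj[u].append(v)
            adj[v].append(u)
        seen, todo = {0}, [0]
        while todo:
            x = todo.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    todo.append(y)
        return len(seen) == n
    return connected(edge_set) and all(connected(edge_set - {e}) for e in edge_set)
```

If the input graph is 2-edge connected, every tree edge is covered by one of the selected back edges, because some vertex below the tree edge has a non-tree edge reaching above it and the highest-reaching edge from that vertex reaches at least as far.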

This paper presents $\frac{4}{3}$-approximation algorithms for both 2EC and 2VC.
The ratio $\frac{4}{3}$ has a special significance; a celebrated conjecture in combinatorial
optimization states that the traveling salesman problem on metrics is approximable
to within $\frac{4}{3}$ via a linear programming relaxation called the subtour
relaxation. A (previously unverified) implication of this conjecture is that 2EC is
also approximable to within $\frac{4}{3}$.

The algorithms are based upon decomposition theorems that allow us to 
eliminate certain structures from a graph. Once these problematic structures 
are prohibited, we simply find a minimum subgraph in which every vertex has 
degree at least two. This can be done in polynomial-time via a reduction to the 
maximum matching problem. We then show that the degree two subgraphs can 
be modified to obtain a solution to the 2EC or 2VC without increasing their 
size by more than a third. The bulk of the analysis lies in showing that such
a low-cost modification is possible and involves detailed arguments for several
tricky configurations. The running time of the algorithm is dominated by the
time to find a maximum matching.



2 The Lower Bound 

The idea behind the algorithms is to take a minimum sized subgraph with min- 
imum degree 2 and, at a small cost, make it 2-connected. We will denote by 
D2 the problem of finding a minimum sized subgraph in which each vertex has 
degree at least two. Notice that solving D2 provides a lower bound for both 2EC 
and 2VC. 

Lemma 1. The size of the optimal solution to D2 is a lower bound on the size
of the optimal solutions to 2EC and 2VC. 

Proof. Any 2-connected subgraph must have minimum degree two. □ 
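Lemma 1 can be illustrated by brute force on a small graph. The sketch below is only a demonstration: the Petersen graph is chosen here as an example and does not come from the paper, and the exhaustive search is exponential, so it is unrelated to the polynomial-time method discussed in the text. All function names are hypothetical.

```python
from itertools import combinations

def connected(n, es):
    """True if the edges es connect all vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in es:
        adj[u].append(v)
        adj[v].append(u)
    seen, todo = {0}, [0]
    while todo:
        x = todo.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return len(seen) == n

def min_degree_two(n, es):
    """Feasibility for D2: every vertex has degree at least two."""
    deg = [0] * n
    for u, v in es:
        deg[u] += 1
        deg[v] += 1
    return min(deg) >= 2

def two_edge_connected(n, es):
    """Feasibility for 2EC: connected, and no single edge is a bridge."""
    es = list(es)
    return connected(n, es) and all(
        connected(n, es[:i] + es[i + 1:]) for i in range(len(es)))

def smallest(n, edges, feasible):
    """Size of a smallest feasible spanning edge subset (exhaustive search)."""
    for size in range(n, len(edges) + 1):
        if any(feasible(n, sub) for sub in combinations(edges, size)):
            return size
    return None

def petersen():
    """Edge list of the Petersen graph on vertices 0..9."""
    outer = [(i, (i + 1) % 5) for i in range(5)]
    spokes = [(i, i + 5) for i in range(5)]
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
    return outer + spokes + inner
```

On this graph the D2 optimum is 10 (two disjoint 5-cycles) while the 2EC optimum is 11, since the graph has no Hamiltonian cycle; the D2 value is a valid lower bound but is not attained.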

This gives a possible method for approximating 2EC and 2VC. Find an optimal
solution to D2. Alter the solution in some way to give a 2-connected graph.
The rest of the paper is devoted to specifying the “some way”
and proving that the method gives a solution whose size is at most $\frac{4}{3}$ the size
of the optimum solution for D2. Hence it is at most $\frac{4}{3}$ the size of the optimum
solution for 2EC or 2VC. The problem D2 can be solved exactly in polynomial
time. One fast way to do this is to find a maximum cycle cover (partition into
directed cycles and paths) and add an arc to the end vertices of each path. It is
not difficult to show that this will give an optimal solution to D2. The maximum
cycle cover problem has a simple reduction to maximum matching, so it can be
solved in time $O(n^{2.5})$.

How good a lower bound is our solution to D2 though? The following exam- 
ples show that, on its own, it is not sufficient for our purposes. Consider finding 
a minimum 2-edge connected subgraph in the graph shown in Fig. 1(i). It is easy
to see that the optimal solution is | times larger than the optimal solution to 
D2. Similarly, if we wish to find a minimum 2-vertex connected subgraph in the
graph in Fig. lii), it is easy to see that this is also | times larger than the optimal 
solution to D2. We will address the implications of these examples in the next 
section. 




Fig. 1. Examples for which the lower bound is not sufficient. 
3 The Decomposition Theorems

3.1 Edge Connectivity 

Cut Vertices. Dealing with the difficulty in the example given in Fig. 1(i) is
not hard. We assume that for the problem 2EC our graph contains no cut vertices.
This condition entails no loss of generality. An $\alpha$-approximation algorithm
for 2-vertex connected graphs can be used to give $\alpha$-approximate solutions for
general graphs. Simply apply the algorithm separately to each maximal 2-vertex
connected block, then combine the solutions to each block.

We may also assume that G has no adjacent degree 2 vertices. This also 
entails no loss of generality. Observe that both edges incident to a degree two 
vertex must be contained in any 2-edge connected spanning subgraph. Hence, 
for a path whose internal vertices are all degree 2, every edge in the path must 
be chosen. This allows us to transform an instance of the general case into an 
instance with non-adjacent degree 2 vertices. Given $G$, contract, to just two
edges, any such path. Call the resulting graph $G'$. Given the trivial bijection
between solutions to $G$ and solutions to $G'$, it follows that any $\alpha$-approximate
solution to $G'$ induces an $\alpha$-approximate solution to $G$.



Beta Structures. The example of Fig. 1(ii) is more troublesome. To counter it
we need to eliminate a certain graph structure. A Beta structure arises when the
removal of two vertices $v_1$ and $v_2$ induces at least three components. Here, we are
concerned with the specific case in which one of the three induced components
consists of a single vertex. We call such a vertex $u$ a Beta vertex. This situation is
illustrated in Fig. 2, with the other two components labeled $C_1$ and $C_2$. A Beta
pair arises when the third induced component consists of two adjacent vertices.
A graph without a Beta vertex or pair will be termed Beta-free.




Fig. 2. A Beta vertex. 



Our basic technique to deal with a Beta structure is as follows. First we will 
find a Beta pair or vertex. Then we will decompose the graph into two pieces 
around the Beta structure and iterate the procedure. Eventually we will have 
a decomposition into Beta-free graphs. We will then work on these pieces and 
show that the sub-solutions obtained can be combined to give a good solution 
to the whole graph. We present all the details of our method for Beta vertices;
similar arguments apply to Beta pairs.

Given a Beta vertex, $u$, with components $C_1$ and $C_2$, let $G_1$ be the subgraph
formed from $G$ by removing $C_2$ and its incident edges. Let $G_1'$ be the
graph formed by contracting $V - G_1$. $G_2$ and $G_2'$ are defined analogously. Let
$k_1, k_1', k_2$ and $k_2'$ be the sizes of the optimal 2-edge connected spanning subgraphs
of $G_1, G_1', G_2$ and $G_2'$, respectively. Let OPT(2EC) be the size of the optimal
solution to the whole graph.



Lemma 2. For any 2-edge connected graph, $\mathrm{OPT(2EC)} = \min(k_1 + k_2',\ k_1' + k_2)$.

Proof. It is clear that the combination of a solution to $G_1$ and $G_2'$, or vice versa,
gives a 2-edge connected subgraph for $G$. Conversely any solution to $G$ can be
decomposed to give a solution to $G_1$ and $G_2'$, or to give a solution to $G_1'$ and
$G_2$. □



To compute optimally the values $k_1, k_1', k_2$ and $k_2'$ is as hard as the original
problem. However, the idea also applies to approximate solutions. Let $r_1, r_1', r_2$
and $r_2'$ be $\alpha$-approximations to $k_1, k_1', k_2$ and $k_2'$, respectively. The next result
follows.

Lemma 3. $\min(r_1 + r_2',\ r_1' + r_2) \le \alpha \cdot \mathrm{OPT(2EC)}$. □

This suggests a decomposition procedure where we find all four values $r_1, r_1',
r_2, r_2'$ and use the two that lead to the solution of smaller cost. We may assume
that $|G_1| \le |G_2|$. We still have a problem, namely, calculating both $r_2$ and $r_2'$
would lead to an exponential time algorithm in the worst case. We will apply a
trick that allows us to compute exactly one of $r_2$ and $r_2'$, as well as both $r_1$ and
$r_1'$, and hence obtain a polynomial-time algorithm.

First note that to connect up $u$ we must use both its incident edges. It follows
that $k_i' + 2 \le k_i \le k_i' + 3$, $i = 1, 2$. This is because, given a 2-edge connected
subgraph for $G_i'$, we are forced to use 2 extra edges in connecting up $u$ and we
may need to use an extra edge from $C_i$ in certain instances in which the solution
to $G_i'$ had all its edges leaving $C_i$ going to the same vertex (either $v_1$ or $v_2$).

Now find $r_1$ and $r_1'$. Again we have $r_1' + 2 \le r_1 \le r_1' + 3$. If the solutions to $G_1$
and $G_1'$ given by the algorithm didn’t satisfy these inequalities then we can alter
one of the solutions to satisfy the inequalities. For example, if $r_1 > r_1' + 3$ then
we may take the solution to $G_1'$ and create a new solution to $G_1$ using at most
three extra edges. So just by computing such an $r_1$ and $r_1'$ we can automatically
find the minimum of the two sums ($r_1 + r_2'$ or $r_1' + r_2$). Hence we can decide
which of $r_2$ or $r_2'$ we need to compute. Along with a procedure for Beta-free
graphs this leads to the following recursive algorithm, $A(G)$, for a graph $G$:
If $G$ has a Beta structure,
Then
    Let $C_1$ and $C_2$ be the Beta structure components, with $|C_1| \le |C_2|$.
    Let $r_1 = |A(G_1)|$ and $r_1' = |A(G_1')|$.
    If $r_1 = r_1' + 2$,
        Then output $A(G_1) \cup A(G_2')$
        Else output $A(G_1') \cup A(G_2)$
Else use the procedure for Beta-free graphs.
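The choice made inside the algorithm can be sketched in isolation. The code below is an illustration only: it assumes the normalised inequalities $r_i' + 2 \le r_i \le r_i' + 3$ from the text for both $i = 1$ and $i = 2$, and it uses hypothetical string labels for the four subproblems. It shows that knowing $r_1$ and $r_1'$ alone is enough to pick the cheaper of the two sums.

```python
def choose(r1, r1p):
    """Return the pair of subproblems to combine, given r1' + 2 <= r1 <= r1' + 3."""
    assert r1p + 2 <= r1 <= r1p + 3, "solutions must be normalised first"
    if r1 == r1p + 2:
        return ("G1", "G2'")      # pay r1 + r2'
    return ("G1'", "G2")          # pay r1' + r2

def cost(choice, r1, r1p, r2, r2p):
    """Size of the combined solution for the chosen pair."""
    return r1 + r2p if choice == ("G1", "G2'") else r1p + r2
```

Only one of $r_2$, $r_2'$ is needed after the choice, which is what avoids branching into both subproblems on the larger side.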

Theorem 1. Algorithm A runs in polynomial time. 

Proof. Let $f(n)$ be the running time for our algorithm on a graph of $n$ vertices. If
$G$ is Beta-free then it turns out that the running time is $O(n^{2.5})$. For graphs with
Beta structures, the running time is at most
$f(n) = f(n - a) + f(a + 1) + f(a + 3) + O(n)$, where $a \le \frac{1}{2}n$. Hence,
$f(n) \le \max(f(n - a) + 2f(a + 3) + O(n),\ O(n^{2.5}))$.
Solving the recurrence we get that $f(n) = O(n^{2.5})$. □

Thus we are left with Beta-free graphs. It is in these graphs that we will solve 
the problem D2. 



3.2 Vertex Connectivity

Beta Structures. Consider the case of a Beta vertex $u$ for the 2-vertex connectivity
problem. We deal with this situation as follows. Given $G$, let $G' = G - u$.
Given that the two edges incident to $u$ are forced in any solution to $G$, we have
the following simple lemma.

Lemma 4. Consider the problem 2VC on a graph $G$ that contains a Beta vertex
$u$. An $\alpha$-approximation for $G$ can be obtained from an $\alpha$-approximation for $G'$
by adding the arcs incident to $u$. □

So for 2VC our decomposition is straightforward. Remove the Beta vertex 
and consider the induced subproblem. 



4 The Tree of Components 

Recall that we observed that we could find an optimal solution to D2
by adding arcs to the end vertices of each path in an optimal solution to the 
maximum cycle cover problem. We take the decomposition into path and cycle 
components obtained from the maximum cycle cover problem and attempt to 
add edges to make the graph 2-edge or 2-vertex connected. A key observation
here is that associated with each path component are two “free” edges available
from our lower bound. We grow a depth first search tree, $T$, with the path and
cycle components as vertices. Call the vertices of the tree “nodes” to distinguish
them from the vertices of our original graph. Arbitrarily choose a node to be 
the root node. Since each node represents a component in the graph we need a
priority structure in order to specify the order in which edges from each node
are examined. So suppose we enter the node along a tree edge (u,v). We have 
two cases. 

1. The node represents a cycle: first search for edges incident to vertices adjacent
to $v$ in the cycle; next consider the vertices at distance 2 from $v$ along
the cycle, etc. Finally consider $v$ itself.

2. The node represents a path: give priority to the “left” endpoint, with declin- 
ing priority as we move rightwards. 

Building the tree in a depth first search manner gives the property that every 
edge in the graph is between a descendant node and one of its ancestor nodes 
(except for edges between vertices in the same node). We will call such edges 
“back edges” signifying that they go back towards the root. Given the tree edges 
and the components, it remains to 2-connect up the graph. The ideas needed 
here are similar for the case of both edge and vertex connectivity. 
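The priority rule for examining the vertices of a component can be sketched as below. This is an illustration under stated assumptions: a cycle node is given as a list of its vertices in cyclic order, a path node as a list from its "left" to its "right" endpoint, and ties between the two vertices at the same distance from the entry vertex are broken arbitrarily (here: forward direction first). The function names are hypothetical.

```python
def cycle_priority(cycle, v):
    """Order in which the vertices of a cycle node are examined when entered at v.

    Vertices adjacent to v come first, then those at distance 2, and so on;
    v itself is examined last.
    """
    k = len(cycle)
    i = cycle.index(v)
    order = []
    for dist in range(1, k // 2 + 1):
        for j in ((i + dist) % k, (i - dist) % k):
            if cycle[j] not in order:      # the antipodal vertex appears once
                order.append(cycle[j])
    order.append(v)
    return order

def path_priority(path):
    """Path nodes are examined from the left endpoint rightwards."""
    return list(path)
```

For a 5-cycle `['a', 'b', 'c', 'd', 'e']` entered at `'a'`, the order is `['b', 'e', 'c', 'd', 'a']`.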



5 Edge Connectivity 

Given our current subgraph, these path and cycle components within a tree 
structure, how do we create a small 2-edge connected subgraph? We attempt to 
2-connect up the tree, working up from the leaves, adding extra back edges as 
we go. Each edge that we add will be charged to a sub-component. Our proof 
will follow from the fact that no component is charged more than one third its 
own weight. 

Each component has one tree edge going up towards the root node. This edge 
we will call an upper tree edge w.r.t. the component. The component may have 
several tree edges going down towards the leaf nodes. These will be called lower 
tree edges w.r.t. the component. Initially each non-root component is charged 
one. This charge is for its incident upper tree edge. 

Lemma 5. For nodes representing cycles the charge is at most 2. 

Proof. If we examine such a component then we may add any back edge from 
the node, in addition to the upper tree edge. This gives a charge of two to the 
component. □ 

Notice that if the cycle component contains at least 6 edges then the associated
charge is at most one third the component’s weight. It remains to deal with
cycles of at most 5 edges, and with paths. Consider the case of small cycles. Our
aim is to show that in picking an extra back edge to 2-connect up such cycles
we can remove some edge. In effect, we get the back edge for free and the charge
to the node is just one.

Lemma 6. For cycle nodes of size 3, the charge is one. 
Proof. We will label a back edge from the node going up the DFS tree by $b$.
A back edge from below that is incident to the node will be labeled $f$. The
deepest back edge from the descendants of the node will be labeled $d$. In all of
our diagrams we use the following scheme. Solid edges were already selected at
the point of examination of the component; dashed edges were not. Bold edges
are selected by the algorithm after the examination; plain edges are not.

I. Lower tree edges are all adjacent at the same vertex as the upper tree
edge. Assume this vertex is $v_1$. Note that neither $v_2$ nor $v_3$ has incident edges
going down the tree, otherwise the DFS algorithm would have chosen them as
tree edges. Now $v_1$ is not a cut vertex, so there must be a back edge $b$ from $v_2$
or $v_3$. Suppose it is incident to $v_3$; the other case is similar. We may add $b$ and
remove the edge $(v_1, v_3)$ to 2-connect upwards the node. This is shown in Fig.
3.I).

Fig. 3. 3 cycles. Panels: I), II)1), II)2).



II. The upper tree edge is incident to a vertex which has an adjacent vertex
that is incident to a lower tree edge.

1. There is a back edge $b$ from $v_2$ or $v_3$. We may assume it is from $v_3$. The
other case is similar. Add $b$ and remove the edge $(v_1, v_3)$. This is shown in
Fig. 3.II)1).

2. Otherwise, since $v_1$ is not a cut vertex, there is an edge $d$ going beyond $v_1$.
Add $d$ and remove $(v_1, v_2)$. See Fig. 3.II)2). □



Lemma 7. For cycle nodes of size 4, the charge is one. 

Proof. I. Lower tree edges are all adjacent at the same vertex as the upper tree
edge.

1. There is a back edge $b$ from $v_2$ or $v_4$. We may assume it is from $v_4$; the other
case is similar. Add $b$ and remove the edge $(v_1, v_4)$. This is shown in Fig.
4.I)1).

2. Otherwise, since $v_1$ is not a cut vertex, there is a back edge $b$ incident to
$v_3$. In addition, $v_4$ is not a Beta vertex. Therefore, as there is no edge from
$v_4$ going down the tree (otherwise it would be a tree edge) there must be
an edge between $v_2$ and $v_4$. Add $b$ and $(v_2, v_4)$. Remove $(v_1, v_4)$ and $(v_2, v_3)$.
See Fig. 4.I)2).

Fig. 4. 4 cycles.



II. The upper tree edge is incident to a vertex which has an adjacent vertex
that is incident to a lower tree edge.

1. There is an edge $d$ going beyond $v_1$. Add $d$ and remove $(v_1, v_2)$. See Fig.
4.II)1).

2. There is a back edge $b$ from $v_2$ or $v_4$. We may assume it is from $v_4$; the other
case is similar. Add $b$ and remove the edge $(v_1, v_4)$. This is shown in Fig.
4.II)2).

3. Otherwise, since $v_1$ is not a cut vertex, there is a back edge $b$ incident to $v_3$.
In addition, $v_4$ is not a Beta vertex. Therefore, either there is an edge
between $v_2$ and $v_4$ or there must be an edge $f$ from $v_4$ going down the tree.
In the former case, Fig. 4.II)3)a), add $b$ and $(v_2, v_4)$. Remove $(v_1, v_4)$ and
$(v_2, v_3)$. In the latter case, Fig. 4.II)3)b), add $b$ and $f$. Remove $(v_1, v_4)$ and
$(v_2, v_3)$.

III. The upper tree edge is incident to a vertex which has an opposite vertex
that is incident to a lower tree edge, but adjacent vertices are not incident to a
lower tree edge.

1. There is a back edge $b$ from $v_2$ or $v_4$. We may assume it is from $v_4$. Add $b$
and remove the edge $(v_1, v_4)$.

2. Otherwise, note that $v_4$ is not a Beta vertex. Therefore, as there is no edge
from $v_4$ going down the tree (otherwise it would be a tree edge) there must
be an edge between $v_2$ and $v_4$. In addition, since $v_1$ is not a cut vertex,
there is either an edge $d$ going beyond $v_1$ or there is a back edge $b$ incident
to $v_3$. We deal with both cases in a similar fashion, see Fig. 4.III)2)a) and
4.III)2)b). Add $(v_2, v_4)$ and either $d$ or $b$. Remove $(v_1, v_4)$ and $(v_2, v_3)$. □

Lemma 8. The amortized charge on a 5-cycle is at most □ 

Lemma 9. For path nodes the charge is at most $\lceil k/3 \rceil + 2$, where $k$ is the number
of edges in the path.

Proof. We prove the result for the specific case in which the path $P$ is a leaf
node in the DFS tree $T$. The general case is similar. Let the upper tree edge $e$
for $P$ be incident to a vertex $v$ on the path. Begin by contracting $G - P$ into
the vertex $v$. We are left with a path $P'$ which we attempt to 2-edge connect
up. After 2-edge connecting $P'$ we have two possibilities. Either there are at least
two edges chosen between $P$ and $G - P$, one of which is the edge $e$, or $e$ is
the only edge chosen between $P$ and $G - P$. In the former case we are done; $P$ is
already 2-edge connected upwards. In the latter case we need to add one extra
edge. So in the worst case the cost of dealing with $P$ is the cost of dealing with
$P'$ plus two edges.

So consider $P'$. Start with the leftmost vertex in the path. We add rightward
edges wherever a path edge threatens to be a bridge. On adding an edge it may
be possible to remove some current edge. If not, then all of the vertices to the left
of the current bridge, not contained in any other block, form a new block. The
blocks formed have the property that, except for the blocks containing the path
endpoints, they contain at least three vertices. To see this, let $(u_0, u_1)$ be the
current bridge, with the last block closed at $u_0$. Let $u_2$ and $u_3$ be the two vertices
to the right of $u_1$. Choose the edge, $e$, from the last block reaching furthest to
the right. Now $e$ is not incident to $u_1$, otherwise $u_1$ is a cut vertex. If $e$ is incident
to $u_2$ then there is a rightward edge $e'$ out of $u_1$, otherwise $u_2$ is a cut vertex.
We can choose $e'$ for free by dropping the edge $(u_1, u_2)$, see Fig. 5. If $e$ goes to
$u_3$ or beyond then we are already done.
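The local exchange in this argument can be checked on a small instance. The following is a minimal sketch, not the paper's algorithm: it uses a brute-force bridge test (delete an edge, test connectivity) on a hypothetical four-vertex path $u_0, u_1, u_2, u_3$, where the chord $(u_0, u_2)$ plays the role of $e$ and $(u_1, u_3)$ the role of $e'$. The function names and the example graph are assumptions introduced for illustration.

```python
from collections import defaultdict

def is_connected(n, edges):
    """Return True if the graph on vertices 0..n-1 with these edges is connected."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def bridges(n, edges):
    """Brute force: an edge is a bridge if deleting it disconnects the graph."""
    return [e for i, e in enumerate(edges)
            if not is_connected(n, edges[:i] + edges[i + 1:])]

# Hypothetical instance: vertices 0..3 stand for u_0..u_3.
path = [(0, 1), (1, 2), (2, 3)]   # every path edge is a bridge on its own
e = (0, 2)                        # edge from the last block, incident to u_2
e_prime = (1, 3)                  # rightward edge out of u_1

# Adding e and e', then dropping the path edge (u_1, u_2), leaves no bridge,
# so e' costs nothing extra in this instance.
exchanged = [(0, 1), (2, 3), e, e_prime]
print(bridges(4, path))       # all three path edges
print(bridges(4, exchanged))  # no bridges remain
```

The brute-force test is quadratic in the number of edges; it is used here only because it is easy to verify by hand on an example of this size.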




Fig. 5. Paths. 



We have one proviso. The vertex $v$ that was originally incident to $G - P$
now represents multiple vertices. As such, $v$ may represent a cut vertex in $P'$.
However, whilst the arguments above with respect to cut vertices may not now
apply at this step, it does mean that an edge from $P$ to $G - P$ will be chosen at
this point in order to 2-edge connect up $P'$. Thus, in effect, we save an edge, as
we will not need to add an extra edge from $P$ to $G - P$ at the end of the process.

Now, since the block containing the right endpoint does not contribute an
extra edge, to deal with $P$ we add, at most, an extra $\lceil k/3 \rceil + 2$ edges. Recall that
associated with $P$ we have two free edges (i.e. $k + 2$ edges in total), so these
extra two edges are accounted for. Note that $\lceil k/3 \rceil \le \frac{1}{3}(k + 2)$, so the associated
blow-up factor for each path is at most $\frac{4}{3}$. □
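The arithmetic behind the path bound can be checked numerically. The sketch below assumes one reading of the accounting: the budget for a path with $k$ edges is $k + 2$, and the edges used number at most $k + \lceil k/3 \rceil + 2$. The helper names `ceil_div` and `blow_up` are introduced here for illustration only.

```python
from fractions import Fraction

def ceil_div(a, b):
    """Integer ceiling of a / b for positive b, without floating point."""
    return -(-a // b)

def blow_up(k):
    """Ratio of edges used to edges budgeted for a path with k edges.

    Assumed accounting: budget = k + 2 (path edges plus two free edges),
    used = k + ceil(k/3) + 2.
    """
    budget = k + 2
    used = k + ceil_div(k, 3) + 2
    return Fraction(used, budget)

# ceil(k/3) <= (k + 2)/3 for every positive integer k, checked here up to 1000.
inequality_holds = all(3 * ceil_div(k, 3) <= k + 2 for k in range(1, 1001))

# The ratio never exceeds 4/3 and meets it exactly when k = 1 (mod 3).
worst = max(blow_up(k) for k in range(1, 1001))
print(inequality_holds, worst)
```

Exact rational arithmetic is used so that the comparison with $\frac{4}{3}$ is not affected by rounding.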

Corollary 1. The algorithm is a $\frac{4}{3}$-approximation algorithm for 2EC. □

6 Vertex Connectivity 

Our approach for the problem 2VC is similar to that of 2EC. Given our tree of
components we attempt to 2-vertex connect up the graph using a limited number
of extra edges. In doing so, though, we maintain the following property, where,
given the tree of components $T$, we denote by $T_C$ the subtree in $T$ rooted at a
component $C$.

Property 1: After dealing with component $C$ we require that at least two
edges cross from $T_C$ to $G - T_C$. In addition these two edges must be disjoint.

Lemma 10. For cycle nodes of size 3, the charge is one.

Proof. We will use the following notation. Call the two back edges from the
subtree below the 3-cycle $a$ and $a'$. Let them originate at the vertices $z_1$ and $z_2$.
We will presume that $a$ is the lower tree edge for the 3-cycle. In contrast to the
2EC case our arguments for 2VC are based upon both of the edges from below,
not just one of them. Let the upper tree edge from the 3-cycle be $e = (v_1, y_1)$. In
the course of this proof we will take $y_2$ to be a generic vertex in $T - (T_C \cup y_1)$.
We will assume that any back edge leaving $T_C$ that is not incident to $y_1$ is
incident to the same vertex $y_2$. This is the worst case scenario. The proofs are
simpler if this does not happen to be the case.

I. Lower tree edges are all incident to the same vertex, $v_1$, as the upper tree
edge. Notice that $a'$ is not incident to the 3-cycle. It cannot be incident to $v_1$,
as $a$ is. It is not incident to $v_2$ or $v_3$, as otherwise it would have been chosen as
a lower tree edge instead of $a$.

1. Edge $a'$ is incident to a vertex $y_2 \neq y_1$. Since $v_1$ is not a cut vertex there is
a back edge $e'$ from either $v_2$ or $v_3$. These cases are symmetric; the latter is
shown in Fig. 6.1). We may add $e'$ and remove $(v_1, v_3)$ for a net charge of
one to the cycle.

2. Edge $a'$ is incident to $y_1$. Again there must be a back edge $e'$ from $v_2$ or $v_3$.
If $e'$ is incident to $y_2$, add $e'$ and remove $e$ and $(v_1, v_3)$ for zero net charge
to the cycle. See Fig. 6.2)a). If $e'$ is incident to $y_1$, add $e'$ and remove $e$ and
$(v_1, v_3)$. We still have one spare edge available. Choose any back edge that
is incident to some $y_2 \neq y_1$. See Fig. 6.2)b).

Fig. 6. Panels 1), 2)a) and 2)b).



II. The upper tree edge $e$ is incident to $v_1$ and the lower tree edge $a$ is incident
to an adjacent vertex $v_2$. We deal with this situation in a similar manner to that
previously described.



Lemma 11. For cycle nodes of size 4 or 5 the charge is one. □



Lemma 12. For path nodes the charge is at most $\lceil k/3 \rceil + 2$, where $k$ is the number
of edges in the path. □

Corollary 2. The algorithm is a $\frac{4}{3}$-approximation algorithm for 2VC. □

7 Comparison with the Subtour Relaxation 

Consider the following linear programming relaxation for 2EC. 



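\[
\begin{array}{lll}
\min & \displaystyle\sum_{e \in E} x_e & \\[4pt]
\text{s.t.} & \displaystyle\sum_{e \in \delta(S)} x_e \ge 2 & \forall S \subset V,\ S \neq \emptyset \qquad (1) \\[4pt]
 & x_e \ge 0 & \forall e \in E \qquad (2)
\end{array}
\]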
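For small graphs, constraints (1) and (2) can be verified directly by enumerating every proper nonempty vertex subset. The following is a minimal sketch under that brute-force assumption; the function names and the three example instances are hypothetical and are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations

def cut_value(x, S):
    """Sum of x_e over edges with exactly one endpoint in the vertex set S."""
    return sum(w for (u, v), w in x.items() if (u in S) != (v in S))

def is_feasible(vertices, x):
    """Check x_e >= 0 for all e, and x(delta(S)) >= 2 for all proper nonempty S."""
    if any(w < 0 for w in x.values()):
        return False
    vs = list(vertices)
    for r in range(1, len(vs)):          # subset sizes 1 .. |V| - 1
        for S in combinations(vs, r):
            if cut_value(x, set(S)) < 2:
                return False
    return True

def objective(x):
    """Value of the LP objective, the sum of all edge weights."""
    return sum(x.values())

# Hypothetical instances on four vertices.
cycle4 = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 1}               # integral
k4_frac = {e: Fraction(2, 3) for e in combinations(range(4), 2)}    # fractional
path4 = {(0, 1): 1, (1, 2): 1, (2, 3): 1}                           # endpoint cuts fail

print(is_feasible(range(4), cycle4), objective(cycle4))
print(is_feasible(range(4), k4_frac), objective(k4_frac))
print(is_feasible(range(4), path4))
```

The enumeration is exponential in the number of vertices, so this is only a sanity check for hand-sized examples; an LP solver with a separation routine based on minimum cuts would be the usual choice at scale.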



It may be noted that the LP relaxation is similar to the familiar subtour
relaxation for the Traveling Salesman Problem (TSP). The subtour relaxation
imposes the additional constraints that $\sum_{e \in \delta(\{v\})} x_e = 2$ for all $v \in V$. For metric
costs, though, there is an optimal solution to the LP relaxation which is
also optimal for the subtour relaxation. It has been conjectured that for a graph
with metric costs, $\mathrm{OPT}(\mathrm{TSP}) \le \frac{4}{3}\,\mathrm{OPT}(\mathrm{LP})$. One implication of this conjecture
is that for a 2-edge connected graph, $\mathrm{OPT}(\mathrm{2EC}) \le \frac{4}{3}\,\mathrm{OPT}(\mathrm{LP})$. Carr and Ravi
[1] provide a proof of the implication for the special case of half-integral solutions
of the relaxation. Unfortunately, the implication (in its entirety) does not follow
simply from our result. This is due to the fact that our lower bound, based upon
integral degree 2 subgraphs, may be stronger in some situations than the bound
proffered by the linear program. The graph in Fig. 7a) is an example. Figure
7b) shows the half-integral optimal solution given by the LP. Here, edges given
weight 1 are solid whilst edges of weight $\frac{1}{2}$ are dashed. This has total value 9,
whereas the minimal integral degree 2 subgraph, Fig. 7c), has value 10. A weaker
implication of the TSP conjecture is that 2EC is approximable to within $\frac{4}{3}$, and
our result does indeed verify this implication.



Fig. 7. Subtour vs integral degree 2 subgraph. Panels a), b) and c).



8 Conclusion 

How far can we go with the approach described here? For 2EC, there are two 
bottlenecks: (i) paths and (ii) cycles of short length. When the graph has no 
Beta structures, but has a Hamiltonian path, there is an example where the 
minimum 2-edge connected subgraph is of size |n. If D2 has 4-cycles, then our 
approach leads to a | or worse approximation. One way out might be to find 
a solution to D2 that does not contain any 4-cycles. The complexity of this 
problem is unresolved. Finding a cycle cover (or a solution to D2) with no
5-cycles is NP-hard. Thus, even if one found a way to improve the approximation
for paths, and to solve D2 with no 4-cycles, | does indeed seem to be a barrier. 
Similar techniques may also be applied to develop approximation algorithms for 
the analogous problems in directed graphs. 